Preface



The Foundations of Software Technology and Theoretical Computer Science con- 
ference (FST TCS) is a well-established annual event in the theoretical computer 
science community. The conference provides a forum for researchers to present 
interesting new results in several areas of theoretical computer science. The con- 
ference is now in its twentieth year and has continued to attract high-quality 
submissions and reputed invited speakers. 

This year’s conference attracted 141 submissions (of which 5 were withdrawn) 
from over 25 countries. Each submission was reviewed by at least three referees, 
with most receiving more than four reviews. The Program Committee met in 
New Delhi on 5 and 6 August 2000, with many members participating electron- 
ically over the Internet. The discussions continued over the Internet for several 
days and we finally selected 36 papers for presentation at the conference and 
inclusion in the proceedings. We thank the Program Committee for their su- 
perlative efforts in finding top quality reviewers and working extremely hard to 
ensure the quality of the conference. We also thank all our reviewers for provid- 
ing detailed and informative feedback about the papers. Rich Gerber’s START 
program greatly simplified managing the submissions and the PC work. 

We are grateful to our six invited speakers, Peter Buneman, Bernard Chazelle, 
Allen Emerson, Martin Grötschel, José Meseguer, and Philip Wadler, for agreeing
to speak at the conference and for providing written contributions that are
included in the proceedings. 

Alongside the main conference this year, there are two satellite workshops: one
on Recent Advances in Programming Languages and one on Computational Geometry.

FST TCS is being hosted this year by the Indian Institute of Technology, 
Delhi, after a hiatus of eight years. We thank the Organizing Committee for 
their efforts and several others who have been generous with their time and 
energy in assisting us — Mohammed Sarwat, Kunal Talwar, Surender Baswana, 
Rohit Khandekar, and Harsh Nanda, in particular. 

We express our gratitude for the financial and other support from the various 
sponsors, IBM India Research Laboratory, New Delhi, in particular. We also 
thank TCS (TRDDC), Tata Infotech, Silicon Automation Systems, Cadence 
Design Systems, IIT Delhi, and others. 

We also thank the staff at Springer-Verlag, especially Alfred Hofmann, for
making the production of these proceedings flow very smoothly.

Model Checking:
Theory into Practice 



E. Allen Emerson* 

Department of Computer Sciences and Computer Engineering Research Center 
The University of Texas at Austin, Austin TX-78712, USA 
emerson@cs.utexas.edu
http://www.cs.utexas.edu/users/emerson/



Abstract. Model checking is an automatic method for verifying correct- 
ness of reactive programs. Originally proposed as part of the disserta- 
tion work of the author, model checking is based on efficient algorithms 
searching for the presence or absence of temporal patterns. In fact, model 
checking rests on a theoretical foundation of basic principles from modal 
logic, lattice theory, as well as automata theory that permits program 
reasoning to be completely automated in principle and highly automated 
in practice. Because of this automation, the practice of model checking 
is nowadays well-developed, and the range of successful applications is 
growing. Model checking is used by most major hardware manufactur- 
ers to verify microprocessor circuits, while there have been promising 
advances in its use in software verification as well. The key obstacle to 
applicability of model checking is, of course, the state explosion problem. 
This paper discusses part of our ongoing research program to limit state 
explosion. The relation of theory to practice is also discussed. 



1 Introduction 

There is a chronic need for more effective methods of constructing correct and 
reliable computer software as well as hardware. This need is especially press- 
ing for concurrent, distributed, real-time, or, more generally, reactive systems, 
which can exhibit, unpredictably, any one of an immense set of possible ongoing, 
ideally infinite, behaviors. Many safety critical and economically essential real- 
world applications are reactive systems. Examples include: computer operating 
systems, computer network communication protocols, on-board avionics control 
systems, microprocessors, and even the internet. 

The traditional approach to program verification involves the use of axioms and
inference rules, together with hand proofs of correctness. Because of the
inherent difficulty and sheer tediousness of manual proof construction, this
manual, proof-theoretic strategy has, by and large, fallen from favor. While this
approach has introduced fundamental principles such as coherent design and
compositional proofs, the evidence suggests that it does not scale to real systems.

* This work was supported in part by NSF grant CCR-980-4736 and TARP project 
003658-0650-1999. 
We have developed an alternative model-theoretic strategy for program
reasoning (cf. [7]). The informal motivation is that, while manual proofs are not
feasible, it ought to be possible to devise fully automated design and reasoning
methods using the basic model theory of modal temporal logic. Temporal logic, a
tensed form of modal logic, has been shown to provide a most useful approach to
specifying and reasoning about the complex and ongoing behavior of reactive
systems [27]. The linear temporal logic LTL provides modalities corresponding to
natural language tenses such as Fp (eventually p), Gp (henceforth p), Xp
(nexttime p), and pUq (eventually q and p until then) that are well-suited to
describing ongoing behavior along a (discrete) timeline corresponding to an
execution sequence of a reactive program. Additional expressiveness is gained
through use of the path quantifiers A (for all futures) and E (for some future)
in the branching time logics CTL and CTL* (cf. [9]).

The automation of model-theoretic reasoning is permitted by the fact that
core decision problems for temporal and classical modal logic are decidable. This
provides a means, in principle, of letting programs reason about programs. Thus,
it is significant that the following key problems associated with (propositional)
temporal logic are decidable, in some cases efficiently. First, there is the
satisfiability problem: given a temporal logic specification f, does it have a
model M? Second, there is the model checking problem: is a candidate Kripke
structure model M a genuine model of temporal specification f? Satisfiability
asks if f is realizable in some model, and is useful in automatic program
synthesis. Model checking decides if f is true in a particular model, and caters
for automatic program verification.

We introduced model checking as an algorithmic method of verifying cor- 
rectness of finite state reactive programs [7] (cf. [4], [6], [30]). Model checking 
has turned out to be one of the more useful approaches to verification of reac- 
tive systems, encompassing not only finite state but some finitely represented, 
infinite state systems as well. Our original approach is based on fixpoint
computation algorithms to efficiently determine if a given finite state
transition graph defines a model of a correctness specification given in the
temporal logic CTL.
An advantage of this algorithmic approach is that it caters for both verification 
and debugging. The latter is particularly valuable in the early stages of systems 
development when errors predominate and is widely used in industrial applica- 
tions. Symbolic model checking, which equals our original fixpoint based model 
checking algorithm plus the data structures for symbolic state graph represen- 
tation using binary decision diagrams [3] (cf. [24]), is now a standard industrial 
tool for hardware verification. 
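
As an illustration of the fixpoint computations mentioned above, the following
is a minimal Python sketch of explicit-state CTL checking over a small Kripke
structure. It is not the implementation from [7]; the example graph, the state
names, and the function names (pre_exists, check_EF, check_EG, check_EU) are
assumptions made for this sketch. EF and EU are computed as least fixpoints and
EG as a greatest fixpoint, each by plain iteration until nothing changes.

```python
from typing import Dict, Set

State = str
Graph = Dict[State, Set[State]]  # total transition relation: every state has a successor


def pre_exists(graph: Graph, target: Set[State]) -> Set[State]:
    """EX target: states with at least one successor inside target."""
    return {s for s, succs in graph.items() if succs & target}


def check_EF(graph: Graph, p: Set[State]) -> Set[State]:
    """EF p as the least fixpoint of Z -> p | EX Z, starting from the empty set."""
    z: Set[State] = set()
    while True:
        nxt = p | pre_exists(graph, z)
        if nxt == z:
            return z
        z = nxt


def check_EG(graph: Graph, p: Set[State]) -> Set[State]:
    """EG p as the greatest fixpoint of Z -> p & EX Z, starting from all states."""
    z: Set[State] = set(graph)
    while True:
        nxt = p & pre_exists(graph, z)
        if nxt == z:
            return z
        z = nxt


def check_EU(graph: Graph, p: Set[State], q: Set[State]) -> Set[State]:
    """E[p U q] as the least fixpoint of Z -> q | (p & EX Z)."""
    z: Set[State] = set()
    while True:
        nxt = q | (p & pre_exists(graph, z))
        if nxt == z:
            return z
        z = nxt


# Hypothetical four-state example: a process that may enter a critical
# section or get stuck in a failure state.
graph: Graph = {
    "idle": {"idle", "try"},
    "try": {"crit", "fail"},
    "crit": {"idle"},
    "fail": {"fail"},
}

can_reach_crit = check_EF(graph, {"crit"})
can_avoid_fail_forever = check_EG(graph, {"idle", "try", "crit"})
fail_before_crit = check_EU(graph, {"idle", "try", "fail"}, {"fail"})
```

Because each function returns the set of states satisfying the formula, a
failed check can be followed by inspecting which states fall outside the set,
which is the debugging use noted above.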

The remainder of the paper is organized as follows. In section 2 we describe 
some recent technical advances in limiting state explosion. In section 3 we discuss 
factors relevant to the transition of model checking from theory into practice. 
Some closing remarks are given in section 4. 
2 Limiting State Explosion

The most common calls from industrial users of formal verification tools are for
(a) increased capacity of the tool to handle really large programs/designs, and
for (b) increased automation. Model checking is especially popular in industrial
usage because it is fully automated in principle. However, its capacity and
utility are still limited, primarily by space complexity, and secondarily by time
complexity. It is important that (the representation of) the state graph of the
program being verified fit within the main memory of the computer on which the
model checking tool is running, so as to avoid the slowdown of many orders of
magnitude which occurs when paging to disk. Of course, the underlying
theoretical vexation is the well-known problem of combinatorial state explosion:
e.g., if each sequential process of a parallel system has just 10 local states,
a system with 1000 processes could have as many as $10^{1000}$ global system
states. A better handle on state explosion is essential to increasing the
capacity of model checking tools. Thus, a good deal of effort has been devoted to
limiting state explosion and making model checking more space efficient.
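
The arithmetic behind this example can be checked directly. The short Python
sketch below is only a worked calculation; the function name global_state_bound
is an assumption made for the illustration. It computes the upper bound on
global states and the number of bits needed merely to name one such state.

```python
def global_state_bound(local_states: int, processes: int) -> int:
    """Upper bound on global states: one independent local state per process."""
    return local_states ** processes


bound = global_state_bound(10, 1000)

# Number of decimal digits of the bound (a 1 followed by 1000 zeros).
decimal_digits = len(str(bound))

# Bits needed to give every global state a distinct index 0 .. bound - 1.
bits_per_state = (bound - 1).bit_length()
```

Even a single bit per state is therefore out of reach for an explicit
enumeration, which is why the representation of the state graph, and not only
the running time, is the limiting factor.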

The technique of abstraction is central to the effective management of state 
explosion. In general terms, abstraction means the replacement of an intractably 
large system by a much smaller, abstract system through suppression of inessen- 
tial detail and elimination of redundant information. The abstracted system 
should still preserve relevant information about the original system, and a de- 
sirable attribute is that the abstracted system should be equivalent in some 
appropriate sense to the original system. In the sequel, we will describe some of 
our work on abstraction utilizing the regularity of structure present in many re- 
active systems composed of multiple similar subcomponents, and mention some 
related open problems. 

2.1 Symmetry 

Symmetry Quotient Reduction. One useful approach to abstraction exploits the 
symmetry inherent in many concurrent systems composed of multiple inter- 
changeable subprocesses or subcomponents. Using symmetry we have been able 
to verify a resource controller system with 150 processes and about 10"^^ states 
in roughly an hour on a Sparc. We developed a “group-theoretic” approach to 
symmetry reduction in [14]. The global state transition graph M of such systems
often exhibits a great deal of symmetry, characterized by the group of graph
automorphisms of M. The basic idea is to reduce model checking over the original,
intractably large structure M to model checking over the smaller quotient
structure $\overline{M}$, where symmetric states are identified. The quotient
graph can be exponentially smaller. Technically, let G be any group contained in
$\mathrm{Aut}\,M \cap \mathrm{Auto}\,f$, where Aut M is the group of permutations
of process indices defining automorphisms of M and Auto f is the group of
permutations that leave f and crucial subformulas thereof invariant. We define an
equivalence relation on states of M so that $s \equiv_G t$ iff $t = \pi(s)$ for
some permutation $\pi \in G$. The quotient $\overline{M}$ is obtained by
identifying states that are $\equiv_G$-equivalent. Then, because G respects both
the symmetry of the structure M and the (“internal”) symmetry of the CTL*
specification formula f, we can show $M, s \models f$ iff
$\overline{M}, \overline{s} \models f$, where $\overline{s}$ represents the
equivalence class [s] of states symmetric to s. Since it turns out, in many
practical cases, to be both easy and efficient to construct the quotient, we have
reduced the problem of model checking CTL* properties over an intractably large
structure to model checking over a relatively small structure, with the proviso
that the symmetry of the structure is appropriately respected by the
specification. Other work on symmetry may be found in, e.g., [21], [19], [5].
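
To make the quotient construction concrete, here is a minimal Python sketch
under simplifying assumptions. It is not the algorithm of [14]: the toy mutual
exclusion protocol, the function names, and the choice of the full symmetric
group (all permutations of process indices) are assumptions made for the
illustration. With full symmetry, a canonical representative of each
equivalence class can be obtained simply by sorting the tuple of local states,
and the quotient is explored by canonicalizing every successor.

```python
from typing import Callable, Iterator, Set, Tuple

GlobalState = Tuple[str, ...]  # one local state per process: "N", "T" or "C"


def successors(state: GlobalState) -> Iterator[GlobalState]:
    """Interleaved moves of a hypothetical, fully symmetric protocol."""
    for i, loc in enumerate(state):
        if loc == "N":                          # noncritical -> trying
            yield state[:i] + ("T",) + state[i + 1:]
        elif loc == "T" and "C" not in state:   # trying -> critical if nobody is critical
            yield state[:i] + ("C",) + state[i + 1:]
        elif loc == "C":                        # critical -> noncritical
            yield state[:i] + ("N",) + state[i + 1:]


def identity(state: GlobalState) -> GlobalState:
    """No reduction: every global state represents only itself."""
    return state


def canonical(state: GlobalState) -> GlobalState:
    """Representative of the class of states equal up to permuting processes."""
    return tuple(sorted(state))


def explore(n: int, rep: Callable[[GlobalState], GlobalState]) -> Set[GlobalState]:
    """Reachable states from the all-"N" start state, stored via rep."""
    start = rep(("N",) * n)
    seen = {start}
    frontier = [start]
    while frontier:
        s = frontier.pop()
        for t in successors(s):
            r = rep(t)
            if r not in seen:
                seen.add(r)
                frontier.append(r)
    return seen


full = explore(4, identity)        # original structure M
quotient = explore(4, canonical)   # quotient structure

# "At most one process is critical" is invariant under permuting indices,
# so it can be checked on the quotient instead of on M.
mutex_on_full = all(s.count("C") <= 1 for s in full)
mutex_on_quotient = all(s.count("C") <= 1 for s in quotient)
```

For n processes this toy system has $2^n + n \cdot 2^{n-1}$ reachable states,
whereas the quotient has only $2n + 1$, which illustrates the exponential
saving available when the specification respects the symmetry.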

Annotated Symmetry Quotient Reductions. One can also “trade group theory for
automata theory”. In [15] we introduced a powerful alternative method. This
method is also more uniform in that it permits use of a single annotated quotient
$\overline{M} = M/\mathrm{Aut}\,M$ for model checking for any specification f,
without computing and intersecting with Auto f. The idea is to augment the
quotient with “guides”, indicating how coordinates are permuted from one state to
the next in the quotient. An automaton for f designed to run over paths through M
can be modified into another automaton that runs over $\overline{M}$, using the
guides to keep track of shifting coordinates. This automata-theoretic method is
much more expressive than the purely “group-theoretic” approach above. In
particular, we can show that it makes it possible to efficiently handle reasoning
under fairness assumptions, which the purely “group-theoretic” approach cannot.
Another augmented “model-theoretic” approach is introduced in [16],
accommodating both fairness and discrete real-time in the uniform framework of
the Mu-calculus.

Open problems related to quotient symmetry reduction. Despite the expressive
power of annotated symmetry quotient reduction, efficient handling of general
specifications in which all process indices participate along a path still seems
difficult. It would be interesting, and useful for hardware verification, to see
if the work on fair scheduling (e.g., $GF\,ex_1 \wedge \ldots \wedge GF\,ex_n$)
could be adapted to efficiently handle, e.g., the tighter requirement of
round-robin scheduling ($(ex_1; ex_2; \ldots; ex_n)^\omega$) of an otherwise
fully symmetric system. Such a composite seems to possess only cyclic symmetry,
thereby limiting compression.

Simple Symmetry. Another important type of syntactic reasoning based on
symmetry we call simple symmetry (originally dubbed state symmetry in [14]).
It simplifies reasoning based on the symmetry of the specification, the structure,
and individual states, and does not entail calculating a quotient structure. For
instance, with an appropriately symmetric start state and with appropriate global
symmetry, an individual process (or component) is correct iff all processes are
correct. It permits “quantifier inflation” and “quantifier reduction”:
$M, s \models f_1$ iff $M, s \models \bigwedge_i f_i$, provided all $f_i$ are
identical up to re-indexing. Quantifier inflation via simple symmetry was used in
[26] to facilitate verification of memory arrays, and more recently to radically
reduce case analysis in [25]. In [16] we provided some powerful instances of the
application of simple symmetry. Model checking
$E(FP_1 \wedge \ldots \wedge FP_n)$ is an NP-complete problem, apparently
requiring time exponential in n in the worst case. We use simple symmetry to show
that for a system that (a) is fully symmetric; (b) has a fully symmetric start
state $s_0$; and (c) is already known to be resettable so that $AG\,EF\,s_0$
holds, the above property then amounts to $EF\,P_1$, which can be checked in
polynomial time. This can be quite useful, since many systems do possess such
symmetry and resettability is a common requirement for many hardware circuits and
embedded systems anyway. Many more useful applications of simple symmetry are
possible.

Approximate Symmetry. In conversations we have had with industrial hard- 
ware engineers, it comes out that while symmetry reduction is often applicable 
due to the presence of many similar subcomponents, there are also many in- 
stances where it is not — quite — applicable. That is, the systems are not 
genuinely symmetric but “approximately” symmetric, for example, because of 
one different component or slight differences among all components. This limits 
the scope of utility of symmetry reduction techniques. 

In [17] and [10] we proposed and formalized three progressively “looser”
notions of approximate symmetry: near symmetry, rough symmetry, and, finally,
virtual symmetry. Each can be applied to do group-theoretic quotient reduction
on various asymmetric systems. The correspondence established in each case
between the original large structure and the small quotient structure is exact, a
bisimulation (up to permutation), in fact. Near symmetry can permit symmetry
reduction on systems comprised of 2 similar but not identical processes. Rough
symmetry can accommodate, e.g., for fixed k > 2, systems with k similar but not
identical processes, prioritized statically. Rough symmetry can handle the
readers-writers problem for k readers and ℓ writers. Virtual symmetry is yet more
general and can accommodate dynamically varying priorities.

Additional open problems related to symmetry. There are a number of important
unsolved problems here. One is to broaden the scope of approximate symmetry
reduction as much as possible. Obviously, a system that is “absolutely”
asymmetric is going to lack sufficient regularity and redundancy to permit
identification of approximate symmetries for reduction. Finding a sufficiently
broad notion of approximate symmetry that is flexible enough to cover all useful
applications is open, and perhaps an inherently ill-posed problem. But a
practical solution is highly desirable. One possibility is to formulate a very
general notion of approximate symmetry parameterized by degree of divergence from
actual symmetry. We might look for a notion that is universally applicable in
principle, but which (a) may in general result in a collapsed graph that only
loosely corresponds, say by a conservative or liberal abstraction, to the
original structure; but (b) progresses toward exact abstraction as the system
approaches genuine symmetry. Our previous work with virtual general symmetry
suggests a measure based on the number of “missing” arcs. Another technical
problem that would have significant practical ramifications would be to extend
approximate symmetry to annotated quotient structures (cf. [15], [16]).
2.2 Parameterized Verification

Symmetry quotient reductions, and many other abstraction methods, address 
reasoning about systems with k processes for a possibly large natural number 
constant k. It is often more desirable to reason about systems comprised of 
n homogeneous processes, where n is a natural number parameter. This gives 
rise to the Parameterized Model Checking Problem (PMCP): decide whether a 
temporal property is true for all (sufficiently large) size instances n of a given 
system. The advantage is that one application of PMCP settles all cases of 
interest; there is then no need to be concerned whether a system known to be 
correct for 100 processes might fail for 101 processes or 1000 processes. In general, 
PMCP is undecidable (cf. [AK86]). But because of its practical importance many 
researchers have addressed this problem, obtaining interesting results and partial 
solutions (cf., e.g., [23], [2], [18], [22]). But most of these have potentially serious 
limitations, e.g., they require human assistance (“process invariants”, etc.), are 
only partially automated (may not terminate), are sound but not complete, or 
lack a well-defined domain of applicability. 

Parameterized Synchronous Systems. In [12] we formulated an algorithm for 
determining if a parameterized synchronous system satisfies a specification. The 
method is based on forming a single finite abstract graph (cf. [Lu84]) which
encodes the behavior of concrete systems of all sizes n. An abstract state
$\overline{s}$ records, for a concrete state s, which process locations are
occupied. The theory we developed in
[12] we later applied in [13] to verify correctness of the Society of Automotive 
Engineers SAE-J1850 protocol. This protocol operates along a single wire bus 
in (Ford) automobiles, and coordinates the interactions of (Motorola) micro- 
controllers distributed among the brake units, the airbags, the engine, etc. The 
general goal is to verify parameterized correctness so that the bus operates cor- 
rectly no matter how many units are installed on the bus. (Thus, the same bus 
architecture could be used in small cars and large trucks.) The specific property 
we verified, by using a meta-tool built on-top of SMV, was that higher priority 
messages could not be overtaken by lower priority messages. 

Open problem related to parameterized synchronous systems. A restriction on
the mathematical model used in [12] is that conventional mutual exclusion
algorithms cannot be implemented in it (cf. [18]), roughly because a system with
1 process in a local state is, by definition of the abstract graph,
indistinguishable from one with strictly more than 1 process in that local state.
It would be desirable to overcome this limitation. One possibility is to refine
the abstract graph to distinguish between exactly 1 vs. 2 vs. 3 or more processes
in an abstract state, to accommodate mutual exclusion. Or perhaps this method
could be adjoined with use of a signal token as in [11], possession of which
arbitrates among competing processes.
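
The limitation and the proposed refinement can be seen on the state abstraction
alone. The Python sketch below is an illustration, not the construction of [12]:
the function names occupied and counted, the location names, and the tuple
encoding of concrete states are assumptions. The first function keeps only the
set of occupied locations; the second keeps, per location, a count saturated at
a cap, so that with cap 3 it distinguishes exactly 1, exactly 2, and 3 or more.

```python
from collections import Counter
from typing import FrozenSet, Tuple

ConcreteState = Tuple[str, ...]  # one local state (location) per process


def occupied(state: ConcreteState) -> FrozenSet[str]:
    """Abstract state recording only which locations are occupied."""
    return frozenset(state)


def counted(state: ConcreteState, cap: int = 3) -> Tuple[Tuple[str, int], ...]:
    """Refined abstract state: per location, min(number of processes, cap)."""
    counts = Counter(state)
    return tuple(sorted((loc, min(k, cap)) for loc, k in counts.items()))


one_in_crit: ConcreteState = ("C", "N", "N")
two_in_crit: ConcreteState = ("C", "C", "N")   # violates mutual exclusion

same_under_occupied = occupied(one_in_crit) == occupied(two_in_crit)
same_under_counted = counted(one_in_crit) == counted(two_in_crit)
```

Under occupied the safe state and the violating state collapse to the same
abstract state, while counted keeps them apart; states with 3 or more processes
in a location are still merged, so the refined abstraction remains finite for
every system size.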

Reducing Many Processes to Few. In recent work [EK00] we give a fully
automatic (algorithmic), sound and complete solution to PMCP in a rather broad
framework. We consider asynchronous systems comprised of many homogeneous
copies of a generic process template. The process template is represented as
a synchronization skeleton [8], where enabling guards have a special character,
either disjunctive or conjunctive. Correctness properties are expressed using
CTL*\X (CTL* minus the nexttime X). We reduce model checking for systems
of arbitrary size n to model checking for systems of size (up to) a small cutoff
size c. This establishes decidability of PMCP, as it is only necessary to model
check a finite number of relatively small systems. In a number of interesting
cases, we can establish polynomial time decidability. For example, we can reduce
the problem of checking mutual exclusion for a critical section protocol to
reasoning about systems with 2 processes. We emphasize that this algorithmically
establishes correctness for systems of all sizes n. This method generalizes and
has been applied to systems comprised of multiple heterogeneous classes of
processes, e.g., m readers and n writers.
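
The shape of the resulting procedure, checking every instance up to the cutoff
and nothing larger, can be sketched as follows. This Python fragment is only an
illustration under stated assumptions: the process template, its conjunctive
guard ("enter the critical section only if every other process is in N or T"),
and all function names are hypothetical, and the cutoff is supplied as an input
(2, as quoted above for mutual exclusion), not derived by the code. Whether a
given template meets the conditions of the cutoff results has to be established
separately.

```python
from typing import Iterator, Tuple

GlobalState = Tuple[str, ...]  # local states "N" (noncritical), "T" (trying), "C" (critical)


def successors(state: GlobalState, guarded: bool = True) -> Iterator[GlobalState]:
    """Interleaved moves of n copies of a hypothetical process template."""
    for i, loc in enumerate(state):
        others = state[:i] + state[i + 1:]
        if loc == "N":
            yield state[:i] + ("T",) + state[i + 1:]
        elif loc == "T":
            # Conjunctive guard: all other processes must be in N or T.
            if not guarded or all(o in ("N", "T") for o in others):
                yield state[:i] + ("C",) + state[i + 1:]
        else:
            yield state[:i] + ("N",) + state[i + 1:]


def mutex_holds(n: int, guarded: bool = True) -> bool:
    """Explicit-state check that no reachable state has two critical processes."""
    start: GlobalState = ("N",) * n
    seen = {start}
    frontier = [start]
    while frontier:
        s = frontier.pop()
        if s.count("C") > 1:
            return False
        for t in successors(s, guarded):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return True


def check_up_to_cutoff(cutoff: int, guarded: bool = True) -> bool:
    """Model check every instance of size 1 .. cutoff."""
    return all(mutex_holds(n, guarded) for n in range(1, cutoff + 1))


guarded_ok = check_up_to_cutoff(2)
unguarded_ok = check_up_to_cutoff(2, guarded=False)
```

Dropping the guard already produces a violation at size 2, which is consistent
with the claim that small instances suffice to expose errors in this setting.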



3 Theory and Practice in Model Checking 

In this section we discuss factors related to the usefulness of and use of model 
checking. The core theoretical principles underlying model checking rest on a 
few basic ideas including, e.g. some from modal logic (the finite model theorem), 
lattice theory (the Tarski-Knaster theorem permitting branching time modal- 
ities to be calculated easily), and automata theory (the language containment 
approach). These ideas are well-known. It has also become increasingly clear that 
model checking is useful in practice. Model checking is employed by most major 
hardware companies, e.g., Cadence, IBM, Intel, and Motorola, for verification
and debugging of microprocessor circuits. Model checking is showing promise
for the verification of software, and is being used or is under investigation by,
e.g., Lucent, Microsoft, and NASA. We would now like to offer an account of why it
is that model checking turns out to be useful in practice. It involves the following 
factors. 

Model checking = search $\wedge$ efficiency $\wedge$ expressiveness. Plainly, efficiency is
key. Impressive progress has been made with symbolic data structures such as 
BDDs, and significant advances with various forms of abstraction are ongoing. 
But expressiveness is also key. An efficient automated verification method that 
could not express most of the important correctness properties of interest would 
be of little use, as would a method that lacked the modeling power to capture 
reactive systems of interest. It seems model checking fares well on these fronts. 

Temporal logic provides a powerful and flexible language for specification. 
Overall, it is quite adequate to the task of reasoning about reactive systems. 
This is perhaps a bit surprising when we recall that we are using propositional 
temporal logic. But the work of Kamp [20], showing that LTL is essentially 
equivalent to the First Order Language of Linear Order, does provide a measure 
of expressive completeness. It is certainly the case that there are properties such 
as “P holds at every even moment, and we don’t care about the odd moments” 
which are not expressible in LTL. However, they are expressible in the
framework of $\omega$-fsa’s, which means that our technical model checking
machinery is still applicable. The author, in fact, views automata as just
generalized formulae of temporal logic. Automata and temporal logic, broadly
speaking, are the same thing. The advantage of temporal logic, which is sometimes
important in practice, is the close connection with tensed natural language.
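
The “every even moment” property is easy to recognize with a two-state
automaton that tracks the parity of the current position. The Python sketch
below is an illustration only; the function name and the encoding of an
infinite word as a finite prefix followed by a cycle repeated forever are
assumptions. Each letter is the truth value of P at that moment. Because the
pairing of cycle positions with parities repeats after at most two passes
through the cycle, reading the prefix plus two copies of the cycle is enough to
decide the property for the whole infinite word.

```python
from typing import Sequence


def holds_at_even_moments(prefix: Sequence[bool], cycle: Sequence[bool]) -> bool:
    """Decide 'P holds at every even moment' on the word prefix . cycle^omega."""
    if not cycle:
        raise ValueError("cycle must be non-empty to describe an infinite word")
    parity = 0  # automaton state: 0 at even moments, 1 at odd moments
    for p_holds in list(prefix) + list(cycle) * 2:
        if parity == 0 and not p_holds:
            return False  # P fails at an even moment: reject
        parity ^= 1       # move to the other parity state
    return True


alternating = holds_at_even_moments([], [True, False])       # P true exactly at even moments
odd_cycle = holds_at_even_moments([], [True, False, True])   # cycle of odd length
shifted = holds_at_even_moments([True], [False, True])
```

In the odd-length example P is false at moment 4, which only shows up during
the second pass through the cycle; this is why a single pass would not suffice.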

A finite state framework suffices. At least it does in practice for most of the 
applications most of the time. This is genuinely surprising, but there are several
reasons for it. First, from the standpoint of specifications, almost all
propositional modal and temporal logics, including LTL and CTL, have the
(bounded) finite model property. If a formula f in such a logic is satisfiable in
any model, then it is satisfiable in a finite model (of size bounded by some
function of the length of f, e.g., $\exp(|f|)$). If a system can be specified in
propositional temporal logic,
it can be realized by a finite state program. Two decades of experience shows 
that many (of at least the crucial parts of) reactive systems can be specified 
in propositional temporal logic, and hence should be realized by a finite state 
program. Second, from the standpoint of the reactive programs themselves, we 
see that most solutions to synchronization problems presented in the literature 
are, indeed, finite state. Typically, in a concurrent program, we can cleanly sepa- 
rate out the finite state synchronization skeletons [7] of the individual processes, 
which synchronize and coordinate the processes, from the sequential code. For 
example, in the synchronization skeleton for the solution to the mutual exclusion 
problem, we abstract out the details of the code manipulating the critical sec- 
tion and obtain just a single node. In the field we are starting to see a trend in 
model checking of software of abstracting out the irrelevant sequential parts and 
boiling down the program to a system of finite state synchronization skeletons. 

Model checking is highly automated. It is fully automated in principle and 
highly automated in practice. Human intervention is sometimes required in 
practice for such things as determining a good variable ordering when doing 
BDD-based symbolic model checking. It can also be required in doing certain 
abstractions to get the model to be checked. For instance, in the case where 
one is verifying a module in isolation, typically a human must understand the 
environment in which the module operates and provide an abstraction of the 
environment for the module to interact with. Still, model checking is highly au- 
tomated even in practice. Model checking seems to be more popular in industrial 
usage than theorem proving because of the high degree of automation. Theorem 
provers, while having in principle unbounded capacity, require human expertise
to supply key lemmas, and a skilled “operator” of the tool. In practice, the
operator is usually someone with a Ph.D. in CS, EE, or, quite often, Mathematics.
For this reason, deployment of theorem provers may be hindered in an industrial
setting. Due to the automation of model checkers, they can successfully be
used by engineers and programmers at the M.S. or B.S. level. In management’s 
view, this facilitates the wide-scale deployment of verification technology in the 
organization. 



4 Conclusion 

There is nowadays widespread interest in Computer Aided Verification of reactive
systems, as evidenced by the attention paid to the topics at FST-TCS, as
well as such conferences as CAV, TACAS, CONCUR, FMCAD, etc. The reason
is that such automated techniques as model checking as well as partially auto- 
mated theorem proving have by now been shown to actually work on a variety 
of “industrial strength” examples. Pnueli [28] argues that, due to the success of 
techniques such as model checking on actual applications, we are on the verge of 
an era of Verification Engineering. Of course, there is still a gulf between what 
we need to do and what we currently have the capacity to do. Basic advances 
as well as concerted engineering efforts are called for. One popular, and valu- 
able, idea is the integration of theorem provers with model checkers; a number 
of researchers are pursuing this topic. For the present, however, this researcher’s 
primary interest is still to try to push the idea of model-theoretic automation as 
far as possible, aspiring to Completely Automated Verification. 
Abstract. This document proposes an algebra for XML Query. The
algebra has been submitted to the W3C XML Query Working Group. A
novel feature of the algebra is the use of regular-expression types, similar
in power to DTDs or XML Schemas, and closely related to Hosoya,
Pierce, and Vouillon’s work on XDuce. The iteration construct involves
novel typing rules not encountered elsewhere (even in XDuce).



1 Introduction 

This document proposes an algebra for XML Query. 

This work builds on long standing traditions in the database community. In 
particular, we have been inspired by systems such as SQL, OQL, and nested 
relational algebra (NRA). We have also been inspired by systems such as Quilt, 
UnQL, XDuce, XML-QL, XPath, XQL, and YATL. We give citations for all 
these systems below. 

In the database world, it is common to translate a query language into an 
algebra; this happens in SQL, OQL, and NRA, among others. The purpose of 
the algebra is twofold. First, the algebra is used to give a semantics for the query 
language, so the operations of the algebra should be well-defined. Second, the 
algebra is used to support query optimization, so the algebra should possess a 
rich set of laws. Our algebra is powerful enough to capture the semantics of 
many XML query languages, and the laws we give include analogues of most of 
the laws of relational algebra. 

In the database world, it is common for a query language to exploit schemas 
or types; this happens in SQL, OQL, and NRA, among others. The purpose of 
types is twofold. Types can be used to detect certain kinds of errors at compile 
time and to support query optimization. DTDs and XML Schema can be thought 
of as providing something like types for XML. Our algebra uses a simple type 
system that captures the essence of XML Schema [35]. The type system is close
to that used in XDuce [19]. Our type system can detect common type errors and
support optimization. A novel aspect of the type system (not found in XDuce)
is the description of projection in terms of iteration, and the typing rules for 
iteration that make this viable. 

The best way to learn any language is to use it. To better familiarize readers 
with the algebra, we have implemented a type checker and an interpreter for the 
algebra in OCaml [24]. A demonstration version of the system is available at



S. Kapoor and S. Prasad (Eds.): FST TCS 2000, LNCS 1974, pp. 11-45, 2000.
© Springer-Verlag Berlin Heidelberg 2000







Mary Fernandez, Jerome Simeon, and Philip Wadler 



http://www.cs.bell-labs.com/~wadler/topics/xml.html#xalgebra

The demo system allows you to type in your own queries to be type checked and 
evaluated. All the examples in this paper can be executed by the demo system. 

This paper describes the key features of the algebra. For simplicity, we restrict 
our attention to only three scalar types (strings, integers, and booleans), but we 
believe the system will smoothly extend to cover the continuum of scalar types 
found in XML Schema. Other important features that we do not tackle include 
attributes, namespaces, element identity, collation, and key constraints, among 
others. Again, we believe they can be added within the framework given here. 

The paper is organized as follows. A tutorial introduction is presented in 
Section 2. Section 3 explains key aspects of projection and iteration. A summary 
of the algebra’s operators and type system is given in Section 4. We present some 
equivalence and optimization laws of the algebra in Section 5. Finally, we give 
the static typing rules for the algebra in Section 6. Section 7 discusses open 
issues and problems. 

Cited literature includes: SQL [16], OQL [4,5,13], NRA [8,15,21,22], 
Quilt [11], UnQL [3], XDuce [19], XML Query [33,34], XML Schema [35,36], 
XML-QL [17], XPath [32], XQL [25], and YaTL [14]. 

2 The Algebra by Example 

This section introduces the main features of the algebra, using familiar examples 
based on accessing a database of books. 

2.1 Data and Types 

Consider the following sample data:

<bib>
  <book>
    <title>Data on the Web</title>
    <year>1999</year>
    <author>Abiteboul</author>
    <author>Buneman</author>
    <author>Suciu</author>
  </book>
  <book>
    <title>XML Query</title>
    <year>2001</year>
    <author>Fernandez</author>
    <author>Suciu</author>
  </book>
</bib>

Here is a fragment of an XML Schema for such data.

<xsd:group name="Bib">
  <xsd:element name="bib">
    <xsd:complexType>
      <xsd:group ref="Book"
                 minOccurs="0" maxOccurs="unbounded"/>
    </xsd:complexType>
  </xsd:element>
</xsd:group>

<xsd:group name="Book">
  <xsd:element name="book">
    <xsd:complexType>
      <xsd:element name="title" type="xsd:string"/>
      <xsd:element name="year" type="xsd:integer"/>
      <xsd:element name="author" type="xsd:string"
                   minOccurs="1" maxOccurs="unbounded"/>
    </xsd:complexType>
  </xsd:element>
</xsd:group>

This data and schema are represented in our algebra as follows:

type Bib =
  bib [ Book* ]

type Book =
  book [
    title [ String ],
    year [ Integer ],
    author [ String ]+
  ]

let bibO : Bib =
  bib [
    book [
      title [ "Data on the Web" ],
      year [ 1999 ],
      author [ "Abiteboul" ],
      author [ "Buneman" ],
      author [ "Suciu" ]
    ],
    book [
      title [ "XML Query" ],
      year [ 2001 ],
      author [ "Fernandez" ],
      author [ "Suciu" ]
    ]
  ]

The expression above defines two types, Bib and Book, and defines one global
variable, bibO.

The Bib type consists of a bib element containing zero or more values of type
Book. The Book type consists of a book element containing a title element 
(which contains a string), a year element (which contains an integer), and one 
or more author elements (which contain strings). 

The Bib type corresponds to a single bib element, which contains a forest 
of zero or more Book elements. We use the term forest to refer to a sequence of 
(zero or more) elements. Every element can be viewed as a forest of length one. 

The Book type corresponds to a single book element, which contains one 
title element, followed by one year element, followed by one or more author 
elements. A title or author element contains a string value and a year element 
contains an integer. 

The variable bibO is bound to a literal XML value, which is the data model 
representation of the earlier XML document. The bib element contains two book 
elements. 

The algebra is a strongly typed language; therefore the value of bibO must
be an instance of its declared type, or the expression is ill-typed. Here the value
of bibO is an instance of the Bib type, because it contains one bib element,
which contains two book elements, each of which contains a string-valued title,
an integer-valued year, and one or more string-valued author elements.

For convenience, we define a second global variable bookO, also bound to a 
literal value, which is equivalent to the first book in bibO. 

let bookO : Book =
  book [
    title [ "Data on the Web" ],
    year [ 1999 ],
    author [ "Abiteboul" ],
    author [ "Buneman" ],
    author [ "Suciu" ]
  ]

2.2 Projection 

The simplest operation is projection. The algebra uses a notation similar in 
appearance and meaning to path navigation in XPath. 

The following expression returns all author elements contained in bookO: 

bookO/author 

==> author [ "Abiteboul" ] , 
author [ "Buneman" ] , 
author [ "Suciu" ] 

: author [ String ] + 

The above example and the ones that follow have three parts. First is an
expression in the algebra. Second, following the ==>, is the value of this expression.
Third, following the :, is the type of the expression, which is (of course) also a
legal type for the value.

The following expression returns all author elements contained in book
elements contained in bibO:

bibO/book/author

==> author [ "Abiteboul" ],
    author [ "Buneman" ],
    author [ "Suciu" ],
    author [ "Fernandez" ],
    author [ "Suciu" ]

:   author [ String ]*

Note that in the result, the document order of author elements is preserved and 
that duplicate elements are also preserved. 

It may be unclear why the type of bibO/book/author contains zero or more 
authors, even though the type of a book element contains one or more authors. 
Let’s look at the derivation of the result type by looking at the type of each 
sub-expression: 

bibO : Bib 

bibO/book : Book* 

bibO/book/author : author [ String ]* 

Recall that Bib, the type of bibO, may contain zero or more Book elements;
therefore the expression bibO/book might contain zero book elements, in which
case bibO/book/author would contain no authors.

This illustrates an important feature of the type system: the type of an 
expression depends only on the type of its sub-expressions. It also illustrates 
the difference between an expression’s run-time value and its compile-time type. 
Since the type of bibO is Bib, the best type for bibO/book/author is one listing 
zero or more authors, even though for the given value of bibO the expression 
will always contain exactly five authors. 

2.3 Iteration 

Another common operation is to iterate over elements in a document so that 
their content can be transformed into new content. Here is an example of how 
to process each book to list the authors before the title, and remove the year. 

for b in bibO/book do
  book [ b/author, b/title ]

==> book [
      author [ "Abiteboul" ],
      author [ "Buneman" ],
      author [ "Suciu" ],
      title [ "Data on the Web" ]
    ],
    book [
      author [ "Fernandez" ],
      author [ "Suciu" ],
      title [ "XML Query" ]
    ]

:   book [
      author [ String ]+,
      title [ String ]
    ]*



The for expression iterates over all book elements in bibO and binds the vari- 
able b to each such element. For each element bound to b, the inner expression 
constructs a new book element containing the book’s authors followed by its 
title. The transformed elements appear in the same order as they occur in bibO. 

In the result type, a book element is guaranteed to contain one or more 
authors followed by one title. Let’s look at the derivation of the result type to 
see why: 



bibO/book : Book*
b         : Book
b/author  : author [ String ]+
b/title   : title [ String ]



The type system can determine that b is always Book, therefore the type of 
b/author is author [ String ]+ and the type of b/title is title [ String ]. 

In general, the value of a for loop is a forest. If the body of the loop itself 
yields a forest, then all of the forests are concatenated together. For instance, 
the expression: 



for b in bibO/book do 
b/author 



is exactly equivalent to the expression bibO/book/author. 

Here we have explained the typing of for loops by example. In fact, the
typing rules are rather subtle and are one of the more interesting aspects of the
algebra; they are explained further below.



2.4 Selection 

Projection and for loops can serve as the basis for many interesting queries. The
next four sections show how they provide the power for selection, quantification,
join, and regrouping.

To select values that satisfy some predicate, we use the where expression.
For example, the following expression selects all book elements in bibO that
were published before 2000.

for b in bibO/book do
  where value (b/year) < 2000 do
    b

==> book [
      title [ "Data on the Web" ],
      year [ 1999 ],
      author [ "Abiteboul" ],
      author [ "Buneman" ],
      author [ "Suciu" ]
    ]

:   Book*

The value operator returns the scalar (i.e., string, integer, or boolean) content 
of an element. 

An expression of the form

  where e1 do e2

is just syntactic sugar for

  if e1 then e2 else ()

where e1 and e2 are expressions. Here () is an expression that stands for the
empty sequence, a forest that contains no elements. We also write () for the
type of the empty sequence.

According to this rule, the expression above translates to

for b in bibO/book do
  if value (b/year) < 2000 then b else ()

and this has the same value and the same type as the preceding expression. 

2.5 Quantification 

The following expression selects all book elements in bibO that have some author 
named “Buneman” . 

for b in bibO/book do
  for a in b/author do
    where value (a) = "Buneman" do
      b

==> book [
      title [ "Data on the Web" ],
      year [ 1999 ],
      author [ "Abiteboul" ],
      author [ "Buneman" ],
      author [ "Suciu" ]
    ]

:   Book*

In contrast, we can use the empty operator to find all books that have no
author whose name is Buneman:



for b in bibO/book do
  where empty (for a in b/author do
                 where value (a) = "Buneman" do
                   a) do
    b

==> book [
      title [ "XML Query" ],
      year [ 2001 ],
      author [ "Fernandez" ],
      author [ "Suciu" ]
    ]

:   Book*



The empty expression checks that its argument is the empty sequence (). 

We can also use the empty operator to find all books where all the authors 
are Buneman, by checking that there are no authors that are not Buneman: 



for b in bibO/book do
  where empty (for a in b/author do
                 where value (a) <> "Buneman" do
                   a) do
    b

==> ()

:   Book*



There are no such books, so the result is the empty sequence. Appropriate use 
of empty (possibly combined with not) can express universally or existentially 
quantified expressions. 

Here is a good place to introduce the let expression, which binds a local 
variable to a value. Introducing local variables may improve readability. For 
example, the following expression is exactly equivalent to the previous one. 

for b in bibO/book do
  let nonbunemans = (for a in b/author do
                       where value (a) <> "Buneman" do
                         a) do
    where empty (nonbunemans) do
      b



Local variables can also be used to avoid repetition when the same subexpression 
appears more than once in a query. 



2.6 Join 

Another common operation is to join values from one or more documents. To 
illustrate joins, we give a second data source that defines book reviews:

type Reviews =
  reviews [
    book [
      title [ String ],
      review [ String ]
    ]*
  ]

let reviewO : Reviews =
  reviews [
    book [
      title [ "XML Query" ],
      review [ "A darn fine book." ]
    ],
    book [
      title [ "Data on the Web" ],
      review [ "This is great!" ]
    ]
  ]

The Reviews type contains one reviews element, which contains zero or more 
book elements; each book contains a title and review. 

We can use nested for loops to join the two sources reviewO and bibO on 
title values. The result combines the title, authors, and reviews for each book. 

for b in bibO/book do
  for r in reviewO/book do
    where value (b/title) = value (r/title) do
      book [ b/title, b/author, r/review ]

==> book [
      title [ "Data on the Web" ],
      author [ "Abiteboul" ],
      author [ "Buneman" ],
      author [ "Suciu" ],
      review [ "This is great!" ]
    ],
    book [
      title [ "XML Query" ],
      author [ "Fernandez" ],
      author [ "Suciu" ],
      review [ "A darn fine book." ]
    ]

:   book [
      title [ String ],
      author [ String ]+,
      review [ String ]
    ]*

Note that the outer-most for expression determines the order of the result.
Readers familiar with optimization of relational join queries know that relational
joins commute, i.e., they can be evaluated in any order. This is not true for the
XML algebra: changing the order of the first two for expressions would
produce different output. In Section 7, we discuss extending the algebra to support
unordered forests, which would permit commutable joins.
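
To see the order dependence concretely, here is a small Python sketch in which the two sources are simplified to lists of dictionaries (an assumption made for brevity; the algebra itself works on forests of elements). Swapping the two loops yields the same pairs in a different order.

```python
# Simplified, hypothetical stand-ins for bibO/book and reviewO/book.
books = [
    {"title": "Data on the Web", "authors": ["Abiteboul", "Buneman", "Suciu"]},
    {"title": "XML Query", "authors": ["Fernandez", "Suciu"]},
]
reviews = [
    {"title": "XML Query", "review": "A darn fine book."},
    {"title": "Data on the Web", "review": "This is great!"},
]

# Outer loop over books: the result follows the order of the books.
books_outer = [(b["title"], r["review"])
               for b in books for r in reviews
               if b["title"] == r["title"]]

# Outer loop over reviews: the result follows the order of the reviews.
reviews_outer = [(b["title"], r["review"])
                 for r in reviews for b in books
                 if b["title"] == r["title"]]

print(books_outer)
# prints [('Data on the Web', 'This is great!'), ('XML Query', 'A darn fine book.')]
print(reviews_outer)
# prints [('XML Query', 'A darn fine book.'), ('Data on the Web', 'This is great!')]
```

The two results agree as sets of pairs but differ as sequences, which is why the join does not commute on ordered forests.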

2.7 Restructuring 

Often it is useful to regroup elements in an XML document. For example, each 
book element in bibO groups one title with multiple authors. This expression 
regroups each author with the titles of his/her publications. 

for a in distinct (bibO/book/author) do
  biblio [
    a,
    for b in bibO/book do
      for a2 in b/author do
        where value (a) = value (a2) do
          b/title
  ]

==> biblio [
      author [ "Abiteboul" ],
      title [ "Data on the Web" ]
    ],
    biblio [
      author [ "Buneman" ],
      title [ "Data on the Web" ]
    ],
    biblio [
      author [ "Suciu" ],
      title [ "Data on the Web" ],
      title [ "XML Query" ]
    ],
    biblio [
      author [ "Fernandez" ],
      title [ "XML Query" ]
    ]

:   biblio [
      author [ String ],
      title [ String ]*
    ]*



Readers may recognize this expression as a self-join of books on authors. The
expression distinct (bibO/book/author) produces a forest of author elements
with no duplicates. The outer for expression binds a to each author element,
and the inner for expression selects the title of each book that has some author
equal to a.

Here distinct is an example of a built-in function. It takes a forest of ele- 
ments and removes duplicates. 

The type of the result expression may seem surprising: each biblio element 
may contain zero or more title elements, even though in bibO, every author 
co-occurs with a title. Recognizing such a constraint is outside the scope of 
the type system, so the resulting type is not as precise as we might like. 
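
The regrouping query can be sketched in Python over a simplified representation (dictionaries rather than forests, an assumption for brevity). The distinct helper below keeps the first occurrence of each value; that ordering is assumed here because it matches the result shown above.

```python
def distinct(items):
    """Remove duplicates, keeping the first occurrence of each item."""
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen

# Simplified, hypothetical stand-in for bibO/book.
books = [
    {"title": "Data on the Web", "authors": ["Abiteboul", "Buneman", "Suciu"]},
    {"title": "XML Query", "authors": ["Fernandez", "Suciu"]},
]

all_authors = [a for b in books for a in b["authors"]]

# For each distinct author, collect the titles of the books listing them.
biblio = [(a, [b["title"] for b in books for a2 in b["authors"] if a2 == a])
          for a in distinct(all_authors)]

for author, titles in biblio:
    print(author, titles)
```

Nothing in the loop structure itself guarantees that the inner list is non-empty, which is the value-level counterpart of the title [ String ]* in the result type.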

2.8 Aggregation 

We have already seen several built-in functions, such as children,
distinct, and value. In addition to these, the algebra has five built-in
aggregation functions: avg, count, max, min and sum.

This expression selects books that have more than two authors: 

for b in bibO/book do
  where count (b/author) > 2 do
    b

==> book [
      title [ "Data on the Web" ],
      year [ 1999 ],
      author [ "Abiteboul" ],
      author [ "Buneman" ],
      author [ "Suciu" ]
    ]

:   Book*

All the aggregation functions take a forest with repetition type and return an 
integer value; count returns the number of elements in the forest. 



2.9 Functions 



Functions can make queries more modular and concise. Recall that we used the 
following query to find all books that do not have “Buneman” as an author. 



for b in bibO/book do
  where empty (for a in b/author do
                 where value (a) = "Buneman" do
                   a) do
    b

==> book [
      title [ "XML Query" ],
      year [ 2001 ],
      author [ "Fernandez" ],
      author [ "Suciu" ]
    ]

:   Book*







A different way to formulate this query is to first define a function that takes a 
string s and a book b as arguments, and returns true if book b does not have 
an author with name s. 

fun notauthor (s : String; b : Book) : Boolean = 
empty (for a in b/author do 

where value (a) = s do 
a) 

The query can then be re-expressed as follows. 

for b in bibO/book do
  where notauthor ("Buneman"; b) do
    b

==> book [
      title [ "XML Query" ],
      year [ 2001 ],
      author [ "Fernandez" ],
      author [ "Suciu" ]
    ]

:   Book*



We use semicolon rather than comma to separate function arguments, since 
comma is used to concatenate forests. 

Note that a function declaration includes the types of all its arguments and 
the type of its result. This is necessary for the type system to guarantee that 
applications of functions are type correct. 

In general, any number of functions may be declared at the top-level. The 
order of function declarations does not matter, and each function may refer to 
any other function. Among other things, this allows functions to be recursive 
(or mutually recursive), which supports structural recursion, the subject of the 
next section. 

2.10 Structural Recursion 

XML documents can be recursive in structure; for example, it is possible to define
a part element that directly or indirectly contains other part elements. In the 
algebra, we use recursive types to define documents with a recursive structure, 
and we use recursive functions to process such documents. (We can also use 
mutual recursion for more complex recursive structures.) 

For instance, here is a recursive type defining a part hierarchy.

type Part =
  Basic | Composite

type Basic =
  basic [
    cost [ Integer ]
  ]

type Composite =
  composite [
    assembly_cost [ Integer ],
    subparts [ Part+ ]
  ]

And here is some sample data. 

let partO : Part =
  composite [
    assembly_cost [ 12 ],
    subparts [
      composite [
        assembly_cost [ 22 ],
        subparts [
          basic [ cost [ 33 ] ]
        ]
      ],
      basic [ cost [ 7 ] ]
    ]
  ]

Here vertical bar ( | ) is used to indicate a choice between types: each part is either
basic (no subparts), and has a cost, or is composite, and includes an assembly
cost and subparts.

We might want to translate to a second form, where every part has a total 
cost and a list of subparts (for a basic part, the list of subparts is empty). 

type Part2 = 
part [ 

total_cost [ Integer ] , 
subparts [ Part2* ] 

] 

Here is a recursive function that performs the desired transformation. It uses 
a new construct, the case expression. 

fun convert (p : Part) : Part2 =
  case p of
    b : basic =>
      part [
        total_cost [ value (b/cost) ],
        subparts []
      ]
  | c : composite =>
      let s = (for q in children (c/subparts) do convert (q)) in
        part [
          total_cost [
            value (c/assembly_cost) +
            sum (for t in s/total_cost do value (t))
          ],
          subparts [ s ]
        ]
  end

Each branch of the case is labeled with an element name, basic or composite, 
and with a corresponding variable, b or c. The case expression checks whether 
the value of p is a basic or composite element, and evaluates the corresponding 
branch. If the first branch is taken then b is bound to the value of p, and the 
branch returns a new part with total cost the same as the cost of b, and with no
subparts. If the second branch is taken then c is bound to the value of p. The 
function is recursively applied to each of the subparts of c, giving a list of new 
subparts s. The branch returns a new part with total cost computed by adding 
the assembly cost of c to the sum of the total cost of each subpart in s, and 
with subparts s. 

One might wonder why b and c are required, since they have the same value 
as p. The reason why is that p, b, and c have different types. 

p : Part 
b : Basic 
c : Composite

The types of b and c are more precise than the type of p, because which branch 
is taken depends upon the type of value in p. 

Applying the query to the given data gives the following result. 

convert (partO)

==> part [
      total_cost [ 74 ],
      subparts [
        part [
          total_cost [ 55 ],
          subparts [
            part [
              total_cost [ 33 ],
              subparts []
            ]
          ]
        ],
        part [
          total_cost [ 7 ],
          subparts []
        ]
      ]
    ]

:   Part2

Of course, a case expression may be used in any query, not just in a recursive
one.

2.11 Processing Any Well-Formed Document 

Recursive types allow us to define a type that matches any well-formed XML 
document. This type is called UrTree: 

type UrTree =
    UrScalar
  | ~ [ UrTree* ]

Here UrScalar is a built-in scalar type. It stands for the most general scalar 
type, and all other scalar types (like Integer or String) are subtypes of it. The 
tilde (~) is used to indicate a wild-card type. In general, ~ [t] indicates the type 
of elements that may have any tag, but must have children of type t. So an 
UrTree is either an UrScalar or a wildcard element with zero or more children, 
each of which is itself an UrTree. In other words, any single element or scalar 
has type UrTree. 

The use of UrScalar is a small, but necessary, extension to XML Schema, 
since XML Schema provides no most general scalar type. In contrast, the use of 
tilde is a significant extension to XML Schema, because XML Schema has no 
type corresponding to ~ [t] , where t is some type other than UrTree*. It is not 
clear that this extension is necessary, since the more restrictive expressiveness of 
XML Schema wildcards may be adequate. Also, note that UrTree* is equivalent 
to the UrType in XML Schema. 

In particular, our earlier data also has type UrTree. 

bookO : UrTree 
==> book [ 

title [ "Data on the Web" ] , 
year [ 1999 ] , 
author [ "Abiteboul" ] , 
author [ "Buneman" ] , 
author [ "Suciu" ] 

] 

: UrTree 

A specific type can be indicated for any expression in the query language, by 
writing a colon and the type after the expression. 

As an example, we define a recursive function that converts any XML data 
into HTML. We first give a simplified definition of HTML. 

type HTML =
  ( UrScalar
  | b [ HTML ]
  | ul [ (li [ HTML ])* ]
  )*

An HTML body consists of a sequence of zero or more items, each of which is
either: a scalar; or a b element (boldface) with HTML content; or a ul element
(unordered list), where the children are li elements (list item), each of which
has HTML content.

Now, here is the function that performs the conversion. 

fun html_of_xml (t : UrTree) : HTML =
  case t of
    s : UrScalar =>
      s
  | e =>
      b [ name (e) ],
      ul [ for c in children (e) do li [ html_of_xml (c) ] ]
  end

The case expression checks whether the value of t is a subtype of UrScalar or
otherwise, and evaluates the corresponding branch. If the first branch is taken,
then s is bound to the value of t, which must be a scalar, and the branch returns 
the scalar. If the second branch is taken, then e is bound to the value of t, which 
must not be a scalar, and hence must be an element. The branch returns the 
name of the element in boldface, followed by a list containing one item for each 
child of the element. The function is recursively applied to get the content of 
each list item. 

Applying the query to the book element above gives the following result. 

html_of_xml (bookO)

==> b [ "book" ],
    ul [
      li [ b [ "title" ], ul [ li [ "Data on the Web" ] ] ],
      li [ b [ "year" ], ul [ li [ 1999 ] ] ],
      li [ b [ "author" ], ul [ li [ "Abiteboul" ] ] ],
      li [ b [ "author" ], ul [ li [ "Buneman" ] ] ],
      li [ b [ "author" ], ul [ li [ "Suciu" ] ] ]
    ]

:   HTML

2.12 Top-Level Queries 

A query consists of a sequence of top-level expressions, or query items, where each 
query item is either a type declaration, a function declaration, a global variable 
declaration, or a query expression. The order of query items is immaterial; all 
type, function, and global variable declarations may be mutually recursive. 

A query can be evaluated by the query interpreter. Each query expression 
is evaluated in the environment specified by all of the declarations. (Typically, 
all of the declarations will precede all of the query expressions, but this is not 
required.) We have already seen examples of type, function, and global variable 
declarations. An example of a query expression is:

query html_of_xml (bookO)

To transform any expression into a top-level query, we simply precede the
expression by the query keyword.

3 Projection and Iteration 

This section describes key aspects of projection and iteration. 

3.1 Relating Projection to Iteration 

The previous examples use the / operator liberally, but in fact we use / as 
a convenient abbreviation for expressions built from lower-level operators: for 
expressions, the children function, and case expressions. 

For example, the expression: 

bookO/author 

is equivalent to the expression: 

for c in children (bookO) do
  case c of
    a : author => a
  | b => ()
  end

Here the children function returns a forest consisting of the children of the
element bookO, namely, a title element, a year element, and three author elements
(the order is preserved). The for expression binds the variable c successively to
each of these elements. Then the case expression selects a branch based on the
value of c. If it is an author element then the first branch is evaluated, otherwise
the second branch. If the first branch is evaluated, the variable a is bound to the
same value as c, then the branch returns the value of a. If the second branch
is evaluated, the variable b is bound to the same value as c, then the branch
returns (), the empty sequence.

To compose several expressions using /, we again use for expressions. For 
example, the expression: 

bibO/book/author

is equivalent to the expression:

for c in children (bibO) do
  case c of
    b : book =>
      for d in children (b) do
        case d of
          a : author => a
        | e => ()
        end
  | f => ()
  end

The for expression iterates over all book elements in bibO and binds the variable 
b to each such element. For each element bound to b, the inner expression returns 
all the author elements in b, and the resulting forests are concatenated together 
in order. 

In general, an expression of the form e / a is converted to the form

for v1 in e do
  for v2 in children (v1) do
    case v2 of
      v3 : a => v3
    | v4 => ()
    end

where e is an expression, a is a tag, and v1, v2, v3, v4 are fresh variables (ones
that do not appear in the expression being converted).

According to this rule, the expression bibO/book translates to

for v1 in bibO do
  for v2 in children (v1) do
    case v2 of
      v3 : book => v3
    | v4 => ()
    end

In Section 5 we introduce laws of the algebra, which allow us to simplify this to
the previous expression

for v2 in children (bibO) do
  case v2 of
    v3 : book => v3
  | v4 => ()
  end

Similarly, the expression bibO/book/author translates to

for v5 in (for v2 in children (bibO) do
             case v2 of
               v3 : book => v3
             | v4 => ()
             end) do
  for v6 in children (v5) do
    case v6 of
      v7 : author => v7
    | v8 => ()
    end

Again, the laws will allow us to simplify this to the previous expression

for v2 in children (bibO) do
  case v2 of
    v3 : book =>
      for v6 in children (v3) do
        case v6 of
          v7 : author => v7
        | v8 => ()
        end
  | v4 => ()
  end

These examples illustrate an important feature of the algebra: high-level opera- 
tors may be defined in terms of low-level operators, and the low-level operators 
may be subject to algebraic laws that can be used to further simplify the ex- 
pression. 



3.2 Typing Iteration 

The typing of for loops is rather subtle. We give an intuitive explanation here, 
and cover the detailed typing rules in Section 6. 

A unit type is either an element type a[t], a wildcard type ~[t], or a scalar
type s. A for loop

  for v in e1 do e2

is typed as follows. First, one finds the type of expression e1. Next, for each unit
type in this type one assumes the variable v has the unit type and one types
the body e2. Note that this means we may type the body e2 several times,
once for each unit type in the type of e1. Finally, the types of the body e2 are
combined, according to how the types were combined in e1. That is, if the type
of e1 is formed with sequencing, then sequencing is used to combine the types
of e2, and similarly for choice or repetition.

For example, consider the following expression, which selects all author ele- 
ments from a book. 

for c in children (bookO) do
  case c of
    a : author => a
  | b => ()
  end

The type of children (bookO) is

title [String] , year [Integer] , author [String]+

This is composed of three unit types, and so the body is typed three times.

assuming c has type   title [String]    the body has type   ()
        "             year [Integer]            "           ()
        "             author [String]           "           author [String]

The three result types are then combined in the same way the original unit types
were, using sequencing and iteration. This yields

() , () , author [String]+

as the type of the iteration, and simplifying yields

author [String]+

as the final type.

As a second example, consider the following expression, which selects all 
title and author elements from a book, and renames them. 

for c in children (bookO) do
  case c of
    t : title => titl [ value (t) ]
  | y : year => ()
  | a : author => auth [ value (a) ]
  end

Again, the type of children (bookO) is

title [String] , year [Integer] , author [String]+

This is composed of three unit types, and so the body is typed three times.

assuming c has type   title [String]    the body has type   titl [String]
        "             year [Integer]            "           ()
        "             author [String]           "           auth [String]

The three result types are then combined in the same way the original unit types 
were, using sequencing and iteration. This yields 

titl [String] , () , auth [String] + 

as the type of the iteration, and simplifying yields 

titl [String] , auth [String] + 

as the final type. Note that the title occurs just once and the author occurs one 
or more times, as one would expect. 

As a third example, consider the following expression, which selects all basic
parts from a sequence of parts.

for p in children (partO/subparts) do
  case p of
    b : basic => b
  | c : composite => ()
  end

The type of children (partO/subparts) is

(Basic | Composite)+

This is composed of two unit types, and so the body is typed two times.

assuming p has type   Basic       the body has type   Basic
        "             Composite           "           ()

The two result types are then combined in the same way the original unit types
were, using choice and repetition. This yields

(Basic | ())+

as the type of the iteration, and simplifying yields

Basic*

as the final type. Note that although the original type involves repetition one 
or more times, the final result is a repetition zero or more times. This is what 
one would expect, since if all the parts are composite the final result will be an 
empty sequence. 

In this way, we see that for loops can be combined with case expressions 
to select and rename elements from a sequence, and that the result is given a 
sensible type. 

In order for this approach to typing to be sensible, it is necessary that the unit 
types can be uniquely identified. However, the type system given here satisfies 
the following law. 

a[t1 | t2] = a[t1] | a[t2]

This has one unit type on the left, but two distinct unit types on the right, and so 
might cause trouble. Fortunately, our type system inherits an additional restric- 
tion from XML Schema: we insist that the regular expressions can be recognized 
by a top-down deterministic automaton. In that case, the regular expression
must have the form on the left; the form on the right is outlawed because it
requires a non-deterministic recognizer. With this additional restriction, there is
no problem. 



4 Summary of the Algebra 

In this section, we summarize the algebra and present the grammars for expres- 
sions and types. 

4.1 Expressions

Figure 1 contains the grammar for the algebra, i.e., the convenient concrete 
syntax in which a user may write a query. A few of these expressions can be 
rewritten as other expressions in a smaller core algebra; such reducible
expressions are labeled with "*". We define the algebra's typing rules on the smaller
core algebra. In Section 5, we give the laws that relate a user expression with its 
equivalent expression in the core algebra. Typing rules for the core algebra are 
defined in Section 6. 

We have seen examples of most of the expressions, so we will only point out 
details here. We define a subset of expressions that correspond to data values. 
An expression is a data value if it consists only of scalar constant, element, 
sequence, and empty sequence expressions. 

We have not defined the semantics of the binary operators in the algebra. It 
might be useful to define more than one type of equality over scalar and element 
values. We leave that to future work. 

4.2 Types 

Figure 2 contains the grammar for the algebra’s type system. We have already 
seen many examples of types. Here, we point out some details. 

Our algebra uses a simple type system that captures the essence of XML 
Schema [35]. The type system is close to that used in XDuce [19]. 

In the type system of Figure 2, a scalar type may be a UrScalar, Boolean, 
Integer, or String. In XML Schema, a scalar type is defined by one of fourteen 
primitive datatypes and a list of facets. A type hierarchy is induced between 
scalar types by containment of facets. The algebra’s type system can be general- 
ized to support these types without much increase in its complexity. We added 
UrScalar, because XML Schema does not support a most general scalar type. 

A type is either: a type variable; a scalar type; an element type with literal 
tag a and content type t; a wildcard type with an unknown tag and content type 
t; a sequence of two types, a choice of two types; a repetition type; the empty 
sequence type; or the empty choice type. 

The algebra’s external type system, that is, the type definitions associated 
with input and output documents, is XML Schema. The internal types are in 
some ways more expressive than XML Schema, for example, XML Schema has no 
type corresponding to Integer* (which is required as the type of the argument 
to an aggregation operator like sum or min or max), or corresponding to ~ [t] 
where t is some type other than UrTree*. In general, mapping XML Schema 
types into internal types will not lose information, however, mapping internal 
types into XML Schema may lose information. 

4.3 Relating Values to Types 

Recall that data is the subset of expressions that consists only of scalar constant, 
element, sequence, and empty sequence expressions. We write ⊢ d : t if data d
has type t. The following type rules define this relation.

tag         a
function    f
variable    v
integer     c_int  ::= ... | -1 | 0 | 1 | ...
string      c_str  ::= "" | "a" | "b" | ...
boolean     c_bool ::= false | true
constant    c      ::= c_int | c_str | c_bool
operator    op     ::= + | - | and | or
                     | = | != | < | <= | >= | >

expression  e ::= c                                    scalar constant
                | v                                    variable
                | a[e]                                 element
                | ~e[e]                                computed element
                | e , e                                sequence
                | ()                                   empty sequence
                | if e then e else e                   conditional
                | let v = e do e                       local binding
                | for v in e do e                      iteration
                | case e of v:p => e | v => e end      case
                | f(e;...;e)                           function application
                | e : t                                explicit type
                | empty(e)                             emptiness predicate
                | error                                error
                | e + e                                plus
                | e = e                                equal
                | children(e)                          children
                | name(e)                              element name
                | e / a                                projection *
                | where e then e                       conditional *
                | value(e)                             scalar content *
                | let v : t = e do e                   local binding *

pattern     p ::= a                                    element
                | ~                                    wildcard
                | s                                    scalar

query item  q ::= type x = t                           type declaration
                | fun f(v:t;...;v:t):t = e             function declaration
                | let v : t = e                        global declaration
                | query e                              query expression

data        d ::= c                                    scalar constant
                | a[d]                                 element
                | d , d                                sequence
                | ()                                   empty sequence

Fig. 1. Algebra

tag          a
type name    x

scalar type  s ::= Integer
                 | String
                 | Boolean
                 | UrScalar

type         t ::= x              type name
                 | s              scalar type
                 | a[t]           element
                 | ~[t]           wildcard
                 | t , t          sequence
                 | t | t          choice
                 | t*             repetition
                 | ()             empty sequence
                 | ∅              empty choice

unit type    u ::= a[t]           element
                 | ~[t]           wildcard
                 | s              scalar type

Fig. 2. Type System



⊢ c_int : Integer     ⊢ c_str : String     ⊢ c_bool : Boolean     ⊢ c : UrScalar

   ⊢ d : t               ⊢ d : t
---------------      ---------------
⊢ a[d] : a[t]        ⊢ a[d] : ~[t]

⊢ d1 : t1    ⊢ d2 : t2
----------------------          ⊢ () : ()
⊢ d1 , d2 : t1 , t2

    ⊢ d : t1               ⊢ d : t2
-----------------      -----------------
⊢ d : (t1 | t2)        ⊢ d : (t1 | t2)

⊢ d1 : t    ⊢ d2 : t*
---------------------           ⊢ () : t*
⊢ (d1 , d2) : t*
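
The value-typing relation above can be read as a membership test for regular expressions over forests. The following is a minimal Python sketch of such a test, not part of the algebra itself. The encoding is an assumption made for illustration: data is a Python list of items, an item is a scalar or a tuple ("elem", tag, children), and types are tuples such as ("scalar", "String"), ("elem", tag, t), ("wild", t), ("seq", t1, t2), ("choice", t1, t2), ("star", t), ("empty",) for () and ("none",) for the empty choice.

```python
def scalar_ok(s, item):
    """Does a scalar item inhabit scalar type s?"""
    if isinstance(item, tuple):
        return False
    if s == "UrScalar":
        return isinstance(item, (bool, int, str))
    if s == "Boolean":
        return isinstance(item, bool)
    if s == "Integer":
        return isinstance(item, int) and not isinstance(item, bool)
    return s == "String" and isinstance(item, str)


def ends(t, d, i):
    """Set of positions j such that the slice d[i:j] has type t."""
    kind = t[0]
    if kind == "scalar":
        return {i + 1} if i < len(d) and scalar_ok(t[1], d[i]) else set()
    if kind in ("elem", "wild"):              # a[t] or ~[t]
        if i < len(d) and isinstance(d[i], tuple):
            _, tag, kids = d[i]
            content = t[2] if kind == "elem" else t[1]
            if (kind == "wild" or tag == t[1]) and has_type(kids, content):
                return {i + 1}
        return set()
    if kind == "seq":                         # t1 , t2
        return {k for j in ends(t[1], d, i) for k in ends(t[2], d, j)}
    if kind == "choice":                      # t1 | t2
        return ends(t[1], d, i) | ends(t[2], d, i)
    if kind == "star":                        # t*: zero or more repetitions
        seen, frontier = {i}, {i}
        while frontier:
            frontier = {k for j in frontier for k in ends(t[1], d, j)} - seen
            seen |= frontier
        return seen
    if kind == "empty":                       # ()
        return {i}
    return set()                              # empty choice: no value has this type


def has_type(d, t):
    """True if the whole forest d has type t."""
    return len(d) in ends(t, d, 0)
```

For instance, with author = ("elem", "author", ("scalar", "String")), the call has_type([], ("star", author)) holds by the rule for the empty sequence, while no forest has the empty choice type.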



We write t1 <: t2 if for every data d such that ⊢ d : t1 it is also the case
that ⊢ d : t2, that is, t1 is a subtype of t2. It is easy to see that <: is a partial
order, that is, it is reflexive, t <: t, and it is transitive, if t1 <: t2 and t2 <: t3
then t1 <: t3. We also have that ∅ <: t for any type t, and a[t] <: ~[t]. We have
s <: UrScalar for every scalar type s. We have t1 <: (t1 | t2) and t2 <: (t1 | t2)
for any t1 and t2. If t <: t', then a[t] <: a[t'] and t* <: t'*. And if t1 <: t1' and
t2 <: t2' then t1 , t2 <: t1' , t2'.

We write t1 = t2 if t1 <: t2 and t2 <: t1. Here are some of the equations that
hold.

UrScalar         = Integer | String | Boolean
(t1 , t2) , t3   = t1 , (t2 , t3)
t , ()           = t
() , t           = t
t1 | t2          = t2 | t1
(t1 | t2) | t3   = t1 | (t2 | t3)
t | ∅            = t
∅ | t            = t
t1 , (t2 | t3)   = (t1 , t2) | (t1 , t3)
(t1 | t2) , t3   = (t1 , t3) | (t2 , t3)
t , ∅            = ∅
∅ , t            = ∅
a[t] | ~[t]      = ~[t]
t*               = () | t , t*

We also have that t1 <: t2 if and only if t1 | t2 = t2.

We define t? and t+ as abbreviations, by the following equivalences.

t? = () | t
t+ = t , t*







Mary Fernandez, Jerome Simeon, and Philip Wadler 



e / a
  => for v1 in e do                                  (1)
       for v2 in children(v1) do
         case v2 of
           v3 : a => v3
         | v4 => ()
         end

where e1 then e2
  => if e1 then e2 else ()                           (2)

value(e)
  => case children(e) of                             (3)
       v1 : UrScalar => v1
     | v2 => v2 : ∅
     end

let v : t = e1 do e2
  => let v = (e1 : t) do e2                          (4)

Fig. 3. Definitions 



5 Equivalences and Optimization 

5.1 Equivalences 

Figure 3 contains the laws that relate the reducible expressions (i.e., those labeled
with "*" in Figure 1) to equivalent expressions. In these definitions, e1{e2/v}
denotes the expression e1 in which all occurrences of v are replaced by e2.

In Rule 1, the projection expression e/a is rewritten as described previously. 
Rule 2 rewrites a where expression as a conditional, as described previously. 
Rule 3 rewrites value (e) as a case expression which checks whether the content 
of e is a scalar value, and if so, returns it. If e is not a scalar value, its value
is returned with the empty choice type, which may indicate an error. Rule 4 
rewrites the let expression with a type as a let expression without a type by 
moving the type constraint into the expression. 
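
The four rewriting rules can be sketched as a small desugaring pass. The Python below is an illustrative sketch only; the tuple-based syntax tree, the constructor names ("project", "where", "value", "lettyped", "typed", and so on), the treatment of types as opaque strings, and the fresh-name scheme are all assumptions made for this example and do not come from the algebra's definition.

```python
import itertools

_fresh = itertools.count(1)


def fresh():
    """Return a variable name assumed not to clash with user variables."""
    return f"_v{next(_fresh)}"


def desugar(e):
    """Rewrite the four starred forms into core forms, bottom-up."""
    if not isinstance(e, tuple):
        return e                              # constants, names, tags, types
    e = tuple(desugar(x) for x in e)
    op = e[0]
    if op == "project":                       # e / a                (Rule 1)
        _, e1, a = e
        v1, v2, v3, v4 = fresh(), fresh(), fresh(), fresh()
        return ("for", v1, e1,
                ("for", v2, ("children", ("var", v1)),
                 ("case", ("var", v2), v3, a, ("var", v3), v4, ("empty",))))
    if op == "where":                         # where e1 then e2     (Rule 2)
        _, e1, e2 = e
        return ("if", e1, e2, ("empty",))
    if op == "value":                         # value(e)             (Rule 3)
        _, e1 = e
        v1, v2 = fresh(), fresh()
        return ("case", ("children", e1), v1, "UrScalar", ("var", v1),
                v2, ("typed", ("var", v2), "EmptyChoice"))
    if op == "lettyped":                      # let v : t = e1 do e2 (Rule 4)
        _, v, t, e1, e2 = e
        return ("let", v, ("typed", e1, t), e2)
    return e
```

Because the pass works bottom-up, a nested projection such as bib0/book/author is rewritten by two applications of Rule 1, inner projection first.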

5.2 Optimizations 

Figure 4 contains a dozen algebraic simplification laws. In a relational query 
engine, algebraic simplifications are often applied by a query optimizer before 
a physical execution plan is generated; algebraic simplification can often reduce 
the size of the intermediate results computed by a query interpreter. The purpose 
of our laws is similar - they eliminate unnecessary for or case expressions, or 
they enable other optimizations by reordering or distributing computations. The 
set of laws given is suggestive, rather than complete. 

E ::= if [] then e1 else e2
    | let v = [] do e
    | for v in [] do e
    | case [] of v1:p => e1 | v2 => e2 end

for v in () do e
  => ()                                                        (5)

for v in (e1 , e2) do e3
  => (for v in e1 do e3) , (for v in e2 do e3)                 (6)

for v in e1 do e2
  => e2{e1/v},  if e1 : u                                      (7)

case a[e0] of v1:a => e1 | v2 => e2 end
  => e1{a[e0]/v1}                                              (8)

case a'[e0] of v1:a => e1 | v2 => e2 end
  => e2{a'[e0]/v2},  if a ≠ a'                                 (9)

for v in e do v
  => e                                                         (10)

E[if e1 then e2 else e3]
  => if e1 then E[e2] else E[e3]                               (11)

E[let v = e1 do e2]
  => let v = e1 do E[e2]                                       (12)

E[for v in e1 do e2]
  => for v in e1 do E[e2]                                      (13)

E[case e0 of v1:p => e1 | v2 => e2 end]
  => case e0 of v1:p => E[e1] | v2 => E[e2] end                (14)

Fig. 4. Optimization Laws



Rules 5, 6, and 7 simplify iterations. Rule 5 rewrites an iteration over the
empty sequence as the empty sequence. Rule 6 distributes iteration through
sequence: iterating over the sequence e1 , e2 is equivalent to the sequence of
two iterations, one over e1 and one over e2. Rule 7 eliminates an iteration over
a single element or scalar. If e1 is a unit type, then e1 can be substituted for
occurrences of v in e2.

Rules 8 and 9 eliminate trivial case expressions. 

Rule 10 eliminates an iteration when the result expression is simply the 
iteration variable v. 

Rules 11-16 commute expressions. Each rule actually abbreviates a number
of other rules, since the context variable E stands for a number of different
expressions. The notation E[e] stands for one of the six expressions given with
expression e replacing the hole [] that appears in each of the alternatives. For
instance, one of the expansions of Rule 13 is the following, when E is taken to
be for v in [] do e.

for v2 in (for v1 in e1 do e2) do e3
  => for v1 in e1 do (for v2 in e2 do e3)

Rules 7 and 10 together with the above expansion of Rule 13 are exactly 
analogous to the three monad laws used with list, bag, and set comprehensions 
in nested relational algebra [6,8,22,21], and derived from a similar use in
functional programming [28]. In effect, these three laws show that the for loop 
introduced here is the analogue of a monad for semi-structured data. 
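
The correspondence with the monad laws can be checked on ordinary lists. The sketch below is an analogy in Python, not the algebra itself: for_ is a hypothetical helper that models "for v in e do body" as a flat map over a list, and unit models a value of unit type as a one-item list.

```python
def for_(xs, body):
    """for v in xs do body(v): apply body to each item and concatenate."""
    return [y for x in xs for y in body(x)]


def unit(x):
    """A one-item sequence, standing in for a value of unit type."""
    return [x]


f = lambda n: [n, n + 1]
g = lambda n: [n * 10]
e = [1, 2, 3]

# Rule 7 (left unit): iterating over a single item substitutes it into the body.
left_unit = for_(unit(5), f) == f(5)
# Rule 10 (right unit): a body that returns the iteration variable is the identity.
right_unit = for_(e, unit) == e
# Rule 13 instance (associativity): nested iterations can be re-associated.
assoc = for_(for_(e, f), g) == for_(e, lambda v1: for_(f(v1), g))
```

All three comparisons evaluate to True for these sample functions; the laws themselves hold for any choice of f, g, and e.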

Note that the sophisticated type rule for for loops ensures that the left side of 
Rule 10 is well typed whenever the right side is. (Originally, a less sophisticated 
type rule was used, for which this is not the case.) 

In Section 3.1 we claimed that the expression bib0/book translates to

for v1 in bib0 do
  for v2 in children(v1) do
    case v2 of
      v3 : book => v3
    | v4 => ()
    end

and that this simplifies to

for v2 in children(bib0) do
  case v2 of
    v3 : book => v3
  | v4 => ()
  end

We can now see that the translation happens via Rule 1, and the simplification 
happens via Rule 7. 

In that Section, we also claimed that the expression bib0/book/author
translates to

for v5 in (for v2 in children(bib0) do
             case v2 of
               v3 : book => v3
             | v4 => ()
             end) do
  for v6 in children(v5) do
    case v6 of
      v7 : author => v7
    | v8 => ()
    end

and that this simplifies to

for v2 in children(bib0) do
  case v2 of
    v3 : book =>
      for v6 in children(v3) do
        case v6 of
          v7 : author => v7
        | v8 => ()
        end
  | v4 => ()
  end

We can now see that the translation happens via two applications of Rule 1, and 
the simplification happens via Rule 7 and the above instance of Rule 13. 

To reiterate, these examples illustrate an important feature of the algebra: 
high-level operators may be defined in terms of low-level operators, and the low- 
level operators may be subject to algebraic laws that can be used to further 
simplify the expression. 



6 Type Rules 

We explain our type system in the form commonly used in the programming 
languages community. For a textbook introduction to type systems, see, for 
example, Mitchell [23]. 



6.1 Environments 

The type rules make use of an environment that specifies the types of variables
and functions. The type environment is denoted by Γ, and is composed of a
comma-separated list of variable types, v : t, or function types, f : (t1; ...; tn) → t.
We retrieve type information from the environment by writing (v : t) ∈ Γ to
look up a variable, or by writing (f : (t1; ...; tn) → t) ∈ Γ to look up a function.

The type checking starts with an environment that contains all the types
declared for functions and global variables. For instance, before typing the first
query of Section 2.2, the environment contains: Γ = bib0 : Bib, book0 : Book.
While doing the type-checking, new variables will be added to the environment.
For instance, when typing the query of Section 2.3, variable b will be typed
with Book, and added to the environment. This will result in a new environment
Γ' = Γ, b : Book.

6.2 Type Rules

We write Γ ⊢ e : t if in environment Γ the expression e has type t.

The definition of for uses an auxiliary type judgement, given below, and the
definition of case uses an auxiliary function, given below.



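Γ ⊢ c_int : Integer      Γ ⊢ c_str : String      Γ ⊢ c_bool : Boolean

(v : t) ∈ Γ
-----------
Γ ⊢ v : t

Γ ⊢ e : t
-----------------
Γ ⊢ a[e] : a[t]

Γ ⊢ e1 : String    Γ ⊢ e2 : t
-----------------------------
Γ ⊢ ~e1[e2] : ~[t]

Γ ⊢ e1 : t1    Γ ⊢ e2 : t2
--------------------------
Γ ⊢ e1 , e2 : t1 , t2

Γ ⊢ () : ()

Γ ⊢ e1 : Boolean    Γ ⊢ e2 : t2    Γ ⊢ e3 : t3
----------------------------------------------
Γ ⊢ if e1 then e2 else e3 : (t2 | t3)

Γ ⊢ e1 : t1    Γ, v : t1 ⊢ e2 : t2
----------------------------------
Γ ⊢ let v = e1 do e2 : t2

Γ ⊢ e1 : t1    Γ; for v : t1 ⊢ e2 : t2
--------------------------------------
Γ ⊢ for v in e1 do e2 : t2

Γ ⊢ e0 : u    u' | t' = split^p(u)    Γ, v1 : u' ⊢ e1 : t1    Γ, v2 : t' ⊢ e2 : t2
-----------------------------------------------------------------------------------
Γ ⊢ case e0 of v1:p => e1 | v2 => e2 end : (t1 | t2)

(f : (t1; ...; tn) → t) ∈ Γ
Γ ⊢ e1 : t1'    t1' <: t1
...
Γ ⊢ en : tn'    tn' <: tn
---------------------------
Γ ⊢ f(e1; ...; en) : t

Γ ⊢ e : t
----------------------
Γ ⊢ empty(e) : Boolean

Γ ⊢ error : ∅

Γ ⊢ e : t'    t' <: t
---------------------
Γ ⊢ (e : t) : t

Γ ⊢ e1 : Integer    Γ ⊢ e2 : Integer
------------------------------------
Γ ⊢ e1 + e2 : Integer

Γ ⊢ e1 : t1    Γ ⊢ e2 : t2
--------------------------
Γ ⊢ e1 = e2 : Boolean

Γ ⊢ e : Integer*
-------------------
Γ ⊢ sum e : Integer

Γ ⊢ e : t
---------------------
Γ ⊢ count e : Integer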

The definition of for uses the following auxiliary judgement. We write
Γ; for v : t1 ⊢ e : t2 if in environment Γ, where the bound variable v of an
iteration has type t1, the body e of the iteration has type t2.

Γ, v : u ⊢ e : t'
---------------------
Γ; for v : u ⊢ e : t'

Γ; for v : () ⊢ e : ()

Γ; for v : t1 ⊢ e : t1'    Γ; for v : t2 ⊢ e : t2'
--------------------------------------------------
Γ; for v : t1 , t2 ⊢ e : t1' , t2'

Γ; for v : ∅ ⊢ e : ∅

Γ; for v : t1 ⊢ e : t1'    Γ; for v : t2 ⊢ e : t2'
--------------------------------------------------
Γ; for v : t1 | t2 ⊢ e : t1' | t2'

Γ; for v : t ⊢ e : t'
-----------------------
Γ; for v : t* ⊢ e : t'*
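
The auxiliary judgement for for is structural on the type of the iterated expression, so it can be sketched as a short recursive function. The Python below is an illustrative sketch under assumptions: types are encoded as tuples (("scalar", s), ("elem", a, t), ("wild", t), ("seq", t1, t2), ("choice", t1, t2), ("star", t), ("empty",) for () and ("none",) for the empty choice), the body is modelled as a function from a unit type to a result type, and simplify applies only the unit and zero laws for sequence and choice, not the full equational theory.

```python
def for_type(t, body):
    """Type of `for v in e do ...` when e : t; body maps a unit type to a type."""
    kind = t[0]
    if kind in ("scalar", "elem", "wild"):    # unit type: type the body once
        return body(t)
    if kind in ("empty", "none"):             # () and the empty choice map to themselves
        return t
    if kind in ("seq", "choice"):             # combine results the same way
        return (kind, for_type(t[1], body), for_type(t[2], body))
    if kind == "star":
        return ("star", for_type(t[1], body))
    raise ValueError(f"not a type: {t!r}")


def simplify(t):
    """Apply the unit laws for () and the unit/zero laws for the empty choice."""
    kind = t[0]
    if kind == "star":
        return ("star", simplify(t[1]))
    if kind not in ("seq", "choice"):
        return t
    l, r = simplify(t[1]), simplify(t[2])
    if kind == "seq":
        if ("none",) in (l, r):               # t , ∅ = ∅ , t = ∅
            return ("none",)
        if l == ("empty",):                   # () , t = t
            return r
        if r == ("empty",):                   # t , () = t
            return l
    else:
        if l == ("none",):                    # ∅ | t = t
            return r
        if r == ("none",):                    # t | ∅ = t
            return l
    return (kind, l, r)


S, I = ("scalar", "String"), ("scalar", "Integer")
author = ("elem", "author", S)
# title[String] , year[Integer] , author[String]+   (with t+ expanded to t , t*)
book = ("seq", ("elem", "title", S),
        ("seq", ("elem", "year", I), ("seq", author, ("star", author))))


def body(u):
    """Rename title and author elements, drop everything else."""
    renames = {"title": "titl", "author": "auth"}
    if u[0] == "elem" and u[1] in renames:
        return ("elem", renames[u[1]], S)
    return ("empty",)


result = simplify(for_type(book, body))
```

Here result encodes titl[String] , auth[String] , auth[String]*, that is, titl[String] , auth[String]+, matching the second example of Section 3.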

To determine the types in a case expression, we use the function split^p(t),
where p is a pattern (either an element a, or a wildcard ~, or a scalar s) and t
is a type. For mnemonic convenience we write a[t'] | t'' = split^a(t) or
~[t'] | t'' = split^~(t) or s' <: s | t' = split^s(t), but one should think of the
function as returning a pair consisting of two types t' and t'', or in the last
instance a scalar type s' and a type t'. The function split^p(t) is undefined if
type t involves sequencing, since a case expression acts on elements or scalars,
not sequences.

split^a(s)        = a[∅] | s
split^a(a[t])     = a[t] | ∅
split^a(a'[t])    = a[∅] | a'[t]                     if a ≠ a'
split^a(~[t])     = a[t] | ~[t]
split^a(t1 | t2)  = a[t1' | t2'] | (t1'' | t2'')     where a[ti'] | ti'' = split^a(ti)
split^a(∅)        = a[∅] | ∅

split^~(s)        = ~[∅] | s
split^~(a[t])     = ~[t] | ∅
split^~(~[t])     = ~[t] | ∅
split^~(t1 | t2)  = ~[t1' | t2'] | (t1'' | t2'')     where ~[ti'] | ti'' = split^~(ti)
split^~(∅)        = ~[∅] | ∅

split^s(s')       = s' <: s | ∅                      if s' <: s
                  = ∅ <: s | s'                      otherwise
split^s(a[t])     = ∅ <: s | a[t]
split^s(~[t])     = ∅ <: s | ~[t]
split^s(t1 | t2)  = (s1 | s2) <: s | (t1' | t2')     where si <: s | ti' = split^s(ti)
split^s(∅)        = ∅ <: s | ∅
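
As an illustration, the element-pattern case of split can be sketched in Python. This is a sketch under stated assumptions, not the paper's definition: types use a tuple encoding chosen for this example (("scalar", s), ("elem", a, t), ("wild", t), ("choice", t1, t2), ("none",) for the empty choice), the function returns the pair (content, rest) standing for a[content] | rest, and the wildcard case is assumed to be split^a(~[t]) = a[t] | ~[t].

```python
NONE = ("none",)                              # the empty choice type


def split_tag(a, t):
    """split^a(t): return (content, rest), read as a[content] | rest."""
    kind = t[0]
    if kind == "scalar":                      # a scalar never matches tag a
        return NONE, t
    if kind == "elem":                        # same tag matches fully, other tags not at all
        return (t[2], NONE) if t[1] == a else (NONE, t)
    if kind == "wild":                        # assumed: a wildcard may or may not be an a
        return t[1], t
    if kind == "choice":                      # split both alternatives and recombine
        c1, r1 = split_tag(a, t[1])
        c2, r2 = split_tag(a, t[2])
        return ("choice", c1, c2), ("choice", r1, r2)
    if kind == "none":
        return NONE, NONE
    raise ValueError("split is undefined on this type (for example, on sequences)")
```

Applied to the unit type Basic | Composite from the third example of Section 3, with hypothetical element tags basic and composite, the first component collects the content that a basic branch can see and the second component is what falls through to the default branch.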

6.3 Top-Level Expressions 

We write Γ ⊢ q if in environment Γ the query item q is well-typed.

Γ ⊢ type x = t

Γ, v1 : t1, ..., vn : tn ⊢ e : t'    t' <: t
--------------------------------------------
Γ ⊢ fun f(v1:t1; ...; vn:tn) : t = e

Γ ⊢ e : t'    t' <: t
---------------------
Γ ⊢ let v : t = e

Γ ⊢ e : t
-----------
Γ ⊢ query e

We extract the relevant component of a type environment from a query item
q with the function environment(q).

environment(type x = t)                     = ()
environment(fun f(v1:t1; ...; vn:tn) : t)   = f : (t1; ...; tn) → t
environment(let v : t = e)                  = v : t

We write ⊢ q1 ... qn if the sequence of query items is well typed.

Γ = environment(q1), ..., environment(qn)
Γ ⊢ q1   ...   Γ ⊢ qn
-----------------------------------------
⊢ q1 ... qn



7 Discussion 

The algebra has several important characteristics: its operators are orthogonal, 
strongly typed, and they obey laws of equivalence and optimization. 

There are many issues to resolve in the completion of the algebra. We enu- 
merate some of these here. 

Data Model. Currently, all forests in the data model are ordered. It may be 
useful to have unordered forests. The distinct operator, for example, produces 
an inherently unordered forest. Unordered forests can benefit from many opti- 
mizations for the relational algebra, such as commutable joins. 

The data model and algebra do not define a global order on documents. 
Querying global order is often required in document-oriented queries. 

Currently, the algebra does not support reference values, which are defined 
in the XML Query Data Model. The algebra’s type system should be extended 
to support reference types and the data model operators ref and deref should 
be supported. 

Type System. As discussed, the algebra’s internal type system is closely related to 
the type system of XDuce. A potentially significant problem is that the algebra’s 
types may lose information when converted into XML Schema types, for example, 
when a result is serialized into an XML document and XML Schema. 

The type system is currently first order: it does not support function types 
nor higher-order functions. Higher-order functions are useful for specifying, for 
example, sorting and grouping operators, which take other functions as argu- 
ments. 

The type system is currently monomorphic: it does not permit the definition 
of a function over generalized types. Polymorphic functions are useful for
factoring equivalent functions, each of which operates on a fixed type. The lack of
polymorphism is one of the principal weaknesses of the type system. 

Operators. We intentionally left equality and relational operators on
element and scalar types undefined. These operators should be defined by
consensus.

It may be useful to add a fixed-point operator, which can be used in lieu of 
recursive functions to compute, for example, the transitive closure of a collection. 

Functions. There is no explicit support for externally defined functions.

The set of builtin functions may be extended to support other important 
operators. 

Recursion. Currently, the algebra does not guarantee termination of recursive 
expressions. In order to ensure termination, we might require that a recursive 
function take one argument that is a singleton element, and any recursive invo- 
cation should be on a descendant of that element; since any element has a finite 
number of descendants, this avoids infinite regress. (Ideally, we should have a 
simple syntactic rule that enforces this restriction, but we have not yet devised 
such a rule.) 


Abstract. In 1935, van der Corput asked the following question: Given
an infinite sequence of reals in [0, 1], define

$$D(n) = \sup_{0 \le x \le 1} \Bigl|\, |S_n \cap [0,x]| - nx \,\Bigr|,$$

where $S_n$ consists of the first n elements in the sequence. Is it possible for
$D(n)$ to stay in $O(1)$? Many years later, Schmidt proved that $D(n)$ can
never be in $o(\log n)$. In other words, there are limitations on how well
the discrete distribution, $x \mapsto |S_n \cap [0,x]|$, can simulate the continuous
one, $x \mapsto nx$. The study of this intriguing phenomenon and its numerous
variants related to the irregularities of distributions has given rise
to discrepancy theory. The relevance of the subject to complexity theory 
is most evident in the study of probabilistic algorithms. Suppose that 
we feed a probabilistic algorithm not with a perfectly random sequence 
of bits (as is usually required) but one that is only pseudorandom or 
even deterministic. Should performance necessarily suffer? In particular, 
suppose that one could trade an exponential-size probability space for 
one of polynomial size without letting the algorithm realize the change. 
This form of derandomization can be expressed by saying that a very 
large distribution can be simulated by a small one for the purpose of the 
algorithm. Put differently, there exists a measure with respect to which 
the two distributions have low discrepancy. The study of discrepancy the- 
ory predates complexity theory and a wealth of mathematical techniques 
can be brought to bear to prove nontrivial derandomization results. The 
pipeline of ideas that flows from discrepancy theory to complexity theory
constitutes the discrepancy method. We give a few examples in this
survey. A more thorough treatment is given in our book [15]. We also 
briefly discuss the relevance of the discrepancy method to complexity 
lower bounds. 



1 Facts from Discrepancy Theory 

* Proceedings of FSTTCS-2000. This work was supported in part by NSF Grant
CCR-96-23768, ARO Grant DAAH04-96-1-0181, and NEC Research Institute.

Let $(V, \mathcal{S})$ be a set system, where $V = \{v_1, \ldots, v_n\}$ is the ground set and
$\mathcal{S} = \{S_1, \ldots, S_m\}$, with $S_i \subseteq V$. We wish to color the elements of $V$ red and
blue so that, within each $S_i$, no color greatly outnumbers the other one. To do
that, we choose a function $\chi$ that maps each $v_j \in V$ to an element in $\{-1, 1\}$,
and we define the discrepancy of the set $S_i$ to be

$$\chi(S_i) = \sum_{v_j \in S_i} \chi(v_j).$$

The maximum value of $|\chi(S_i)|$, over all $S_i \in \mathcal{S}$, is the discrepancy of the set
system under the given coloring. The discrepancy of the set system itself, denoted
by $D_\infty(\mathcal{S})$, refers to its minimum discrepancy under all possible colorings. The
$L^2$ norm creates an easier environment to work with, and so we also define

$$D_2(\mathcal{S}) = \min_{\chi} \sqrt{\chi(S_1)^2 + \cdots + \chi(S_m)^2},$$

where the minimum is taken over all colorings $\chi : V \to \{-1, 1\}$. The discrepancy
can be characterized by using matrices, which is sometimes more convenient. Let
$A$ be the incidence matrix of the set system $(V, \mathcal{S})$: the n columns are indexed
by the elements of $V$ and the m rows are the characteristic vectors of the sets
$S_i$, so that $A_{ij}$ is 1 if $v_j \in S_i$ and 0 otherwise. The discrepancy of the set system,
also denoted by $D_\infty(A)$, can be expressed as the $L^\infty$ norm of a column vector.
Generally, for any $p \in \{1, 2, \ldots, \infty\}$, we have $D_p(A) = \min_{x \in \{-1,1\}^n} \|Ax\|_p$.
The following result of Spencer [44] is tight. 
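
For very small set systems the two discrepancies defined above can be computed by exhaustive search over all colorings. The following Python sketch is for illustration only; the function name and the representation of the ground set as {0, ..., n-1} are assumptions, and the search takes time exponential in n.

```python
from itertools import product
from math import sqrt


def discrepancies(n, sets):
    """Return (D_inf, D_2) for a set system on ground set {0, ..., n-1}."""
    best_inf, best_2 = float("inf"), float("inf")
    for chi in product((-1, 1), repeat=n):    # every red/blue coloring
        sums = [sum(chi[v] for v in s) for s in sets]
        best_inf = min(best_inf, max(abs(x) for x in sums))
        best_2 = min(best_2, sqrt(sum(x * x for x in sums)))
    return best_inf, best_2


# All three 2-element subsets of a 3-element ground set.
d_inf, d_2 = discrepancies(3, [{0, 1}, {1, 2}, {0, 2}])
```

In this example any coloring gives two of the three elements the same color, so some pair sums to 2 in absolute value, and both d_inf and d_2 equal 2.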

Theorem 1. Any set system $(V, \mathcal{S})$ such that $|V| = |\mathcal{S}| = n$ has $O(\sqrt{n})$
discrepancy.

For general set systems with m sets, the bound becomes $O(\sqrt{n \ln(2m/n)})$. A

simple, elegant result concerns the case of small-degree set systems. The degree 
refers to the maximum number of sets containing a given element. The classical 
Beck-Fiala theorem [7] states that: 

Theorem 2. The discrepancy of a set system of degree at most t is less than 
2t. 

Techniques for proving lower bounds often involve spectral arguments and, in 
particular, harmonic analysis. The latter comes from the fact that set systems are 
often defined by using a convolution operator, which the Fourier transform
diagonalizes. Bounding the eigenvalues gives us a handle on the $L^2$-norm discrepancy.
Perhaps the simplest result obtained in this manner is Roth's 1/4-Theorem [40].

Theorem 3. Any two-coloring of the integers $\{1, \ldots, n\}$ contains an arithmetic
progression whose discrepancy is $\Omega(n^{1/4})$.

There exists a wealth of techniques and results for geometric set systems. In 
such cases, it is useful to define the notion of volume discrepancy. Consider the 
problem of placing a set P of n points in the unit cube $[0,1]^d$ to minimize the
discrepancy with respect to axis-parallel boxes. The (volume) discrepancy of a
box $B = \prod_{k=1}^{d} [p_k, q_k)$ is defined as

$$D(B) = n \cdot \mathrm{vol}(B) - |P \cap B|.$$

Theorem 4. There is a set of n points in $[0,1]^d$ such that the volume
discrepancy of any box in $[0,1]^d$ is $O(\log n)^{d-1}$ in absolute value.

Here is a construction in two dimensions [46,47]. Given a nonnegative integer
m, let $\sum_{i \ge 0} b_i 2^i$ be its binary decomposition, and let

$$x_1(m) = \sum_{i \ge 0} \frac{b_i}{2^{i+1}} \in [0,1).$$

The numbers $x_1(m)$, for $0 \le m < n$, form the classical van der Corput sequence.
We can use it to define the bit-reversal point set:

$$\bigl\{\, (x_1(m), m/n) : 0 \le m < n \,\bigr\}.$$

This easily generalizes to d dimensions. Choose d − 1 relatively prime numbers:
$2 = p_1, p_2, \ldots, p_{d-1}$. The integer m has a unique decomposition in base $p_k$,
$m = \sum_{i \ge 0} b_k(i)\, p_k^i$, so we can define

$$x_k(m) = \sum_{i \ge 0} \frac{b_k(i)}{p_k^{i+1}}.$$

The point set

$$P = \bigl\{\, (x_1(m), \ldots, x_{d-1}(m), m/n) : 0 \le m < n \,\bigr\}$$

is called Halton-Hammersley [25] and satisfies Theorem 4.

What about the $L^2$ norm? Let P be a set of n points in the unit square.
Given a box $B_q$ of the form $[0, q_1) \times [0, q_2)$, where $q = (q_1, q_2)$, the discrepancy
of $B_q$ is

$$D(B_q) = n \cdot \mathrm{area}(B_q) - |P \cap B_q|.$$

We define the $L^2$-norm discrepancy of P as

$$D_2(P) = \sqrt{\int_{[0,1]^2} D(B_q)^2 \, dq}.$$

The following result is by Davenport [22].
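
The digit-mirroring construction is short enough to state as code. The Python sketch below is illustrative; the function names are chosen for this example, and exact rational arithmetic is used so that the coordinates can be compared without rounding. The bases passed to halton_hammersley are assumed to be pairwise relatively prime, as the construction requires.

```python
from fractions import Fraction


def radical_inverse(m, p):
    """x_k(m) for base p: mirror the base-p digits of m about the radix point."""
    x, scale = Fraction(0), Fraction(1, p)
    while m > 0:
        m, digit = divmod(m, p)
        x += digit * scale
        scale /= p
    return x


def halton_hammersley(n, bases):
    """Points (x_1(m), ..., x_{d-1}(m), m/n) for 0 <= m < n."""
    return [tuple(radical_inverse(m, p) for p in bases) + (Fraction(m, n),)
            for m in range(n)]


# First eight terms of the van der Corput sequence (base 2):
# 0, 1/2, 1/4, 3/4, 1/8, 5/8, 3/8, 7/8
van_der_corput = [radical_inverse(m, 2) for m in range(8)]
```

Calling halton_hammersley(n, [2]) gives the two-dimensional bit-reversal point set, and halton_hammersley(n, [2, 3]) a three-dimensional point set.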

The following result is by Davenport [22]. 

Theorem 5. It is possible to find a set P of n points in $[0,1]^2$ such that
$D_2(P) = O(\sqrt{\log n})$.



We forsake the Halton-Hammersley construction and, instead, turn to a
construction based on irrational lattices. Take the set of n = 2k − 1 points of the
form

for all j ($|j| < k$), where $\{x\} = x \pmod 1$ is the fractional part of x and
$\frac{1}{2}(\sqrt{5} + 1)$ is the golden ratio. The only property we use about the golden
ratio is the size of the partial quotients of its continued fraction expansion, so
many other choices exist for Theorem 5.

We generalize the $L^2$-norm discrepancy to d dimensions in the obvious manner.
Given a point $q = (q_1, \ldots, q_d)$ in the unit cube $[0,1]^d$, let $B_q$ denote the box
$[0, q_1) \times \cdots \times [0, q_d)$. Fix a set P of n points in $[0,1]^d$, and as usual define the
volume discrepancy $D(B_q)$ at a point $q \in [0,1]^d$ as $D(B_q) = n q_1 \cdots q_d - |P \cap B_q|$.
We write

$$D_2(P) = \sqrt{\int_{[0,1]^d} D(B_q)^2 \, dq}.$$

The following bound is due to Roth [39], and shows the optimality of Theorem 5.

Theorem 6. Given a set P of n points in $[0,1]^d$, the mean-square discrepancy
for axis-parallel boxes satisfies

$$D_2(P) > c\,(\log n)^{(d-1)/2},$$

for some constant c = c(d) > 0.

In two dimensions, we have this interesting lower bound by Schmidt [41],
which shows a rare divergence between $L^2$ and $L^\infty$ behaviors.

Theorem 7. Given n points in $[0,1]^2$, there exists a box B such that
$|D(B)| = \Omega(\log n)$.

We now consider rotated boxes. Given a set P of n points in $[0,1]^2$, the
discrepancy of a (rotated) box R is defined naturally as
$D(R) = n \cdot \mathrm{area}(R \cap [0,1]^2) - |P \cap R|$. By rotated box, we mean any rectangle
not necessarily parallel to the axes. The following upper bound was established
by Beck; see Beck and Chen's book [6].

Theorem 8. It is possible to place n points in the unit square $[0,1]^2$, so that
any (rotated) box R satisfies $|D(R)| = O(n^{1/4} \sqrt{\log n})$.

A quasi-matching lower bound was first proven by Beck [5], using his beautiful
Fourier transform method (other proof techniques exist).

Theorem 9. Given n points in the unit square $[0,1]^2$, there exists a rotated box
R such that $|D(R)| = \Omega(n^{1/4})$.

The same bound holds for disks as well. The proof, by Montgomery [34,35],
also uses harmonic analysis.

Theorem 10. Given n points in the unit square $[0,1]^2$, there exists a disk K
such that $|D(K)| = \Omega(n^{1/4})$.


2 Sampling



The red-blue discrepancy of a set system tells us how well we can sample its 
ground set by choosing about half of its elements. What about different sample 
sizes? For example, given a collection of n points in the plane, is it possible to 
choose a subset of constant size, such that any disk that encloses at least one 
percent of the points also includes at least one sample point? Surprisingly, the 
answer is yes. The surprise is that the sample size can be kept independent of 
n. The magic lies in the notion of VC dimension. 

Let $(V, \mathcal{S})$ be a (finite or infinite) set system. Given $Y \subseteq V$, let $(Y, \mathcal{S}|_Y)$
denote the set system induced by Y, i.e., $\{\, Y \cap S \mid S \in \mathcal{S} \,\}$. A subset Y of V is
said to be shattered (by $\mathcal{S}$) if $\mathcal{S}|_Y = 2^Y$, i.e., every subset of Y (including the
empty set) is of the form $Y \cap S$, for some $S \in \mathcal{S}$. The supremum of all sizes of
finite shattered subsets of V is called the Vapnik-Chervonenkis dimension (or
VC-dimension for short) of the set system.

Let $(V, \mathcal{S})$ be a finite set system, where $|V| = n$ and $|\mathcal{S}| = m$. Given any
$0 < \varepsilon < 1$, a set $N \subseteq V$ is called an ε-net for $(V, \mathcal{S})$ if $N \cap S \ne \emptyset$, for any
$S \in \mathcal{S}$ with $|S|/|V| > \varepsilon$. A set $A \subseteq V$ is called an ε-approximation for $(V, \mathcal{S})$
if, for any $S \in \mathcal{S}$,

$$\left| \frac{|S|}{|V|} - \frac{|A \cap S|}{|A|} \right| \le \varepsilon.$$

Equivalently, given a random v uniformly distributed in V, for each $S \in \mathcal{S}$,

$$\Bigl| \mathrm{Prob}[v \in S] - \mathrm{Prob}[v \in S \mid v \in A] \Bigr| \le \varepsilon.$$
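
For a finite set system the three definitions above translate directly into checks. The Python sketch below is illustrative only: the function names are chosen for this example, sets are Python sets over a finite ground set, and the approximation test uses a non-strict inequality, which is an assumption about the intended definition.

```python
from fractions import Fraction


def is_shattered(y, sets):
    """True if every subset of y arises as the trace of some S on y."""
    y = frozenset(y)
    traces = {y & frozenset(s) for s in sets}
    return len(traces) == 2 ** len(y)


def is_eps_net(net, ground, sets, eps):
    """True if net meets every S whose relative size |S|/|V| exceeds eps."""
    net = set(net)
    return all(net & set(s) for s in sets
               if Fraction(len(s), len(ground)) > eps)


def is_eps_approximation(sample, ground, sets, eps):
    """True if the sample's density in every S is within eps of |S|/|V|."""
    sample = set(sample)
    return all(
        abs(Fraction(len(s), len(ground))
            - Fraction(len(sample & set(s)), len(sample))) <= eps
        for s in sets
    )


# Example: prefixes {0, ..., k-1} of an 8-element ground set.
ground = range(8)
prefixes = [set(range(k)) for k in range(9)]
```

On this example a single point is shattered but no pair is, so the VC-dimension is 1; the single point {3} is a (3/8)-net, and the odd positions {1, 3, 5, 7} form a (1/8)-approximation.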



The following was proven by Chazelle and Matousek [18], building on the foun- 
dational work in [16,26,29,48]. 

Theorem 11. Let $(V, \mathcal{S})$ be a set system of VC-dimension d. Given any $r \ge 2$,
a (1/r)-approximation for $(V, \mathcal{S})$ of size $O(d r^2 \log dr)$ can be computed in time
$O(d)^{3d} (r^2 \log dr)^d |V|$.

Theorem 12. Let $(V, \mathcal{S})$ be a set system of VC-dimension d. Given any $r \ge 2$, a
(1/r)-net for $(V, \mathcal{S})$ of size $O(dr \log dr)$ can be computed in time
$O(d)^{3d} (r^2 \log dr)^d |V|$.

Note that the set systems are usually understood as members of an infinite
family; for example the set of all points in $\mathbb{R}^2$ and the set of all disks. The term
range space is often used in the literature to refer to such a family.



3 Geometric Algorithms 

Suppose that we are given a set H of n hyperplanes in $\mathbb{R}^d$. We wish to subdivide
$\mathbb{R}^d$ into a small number of simplices, so that none of them is cut by too many
hyperplanes. Given a parameter $\varepsilon > 0$, a collection C of closed full-dimensional
simplices is called an ε-cutting if: (i) their interiors are pairwise disjoint, and
together they cover $\mathbb{R}^d$; and (ii) the interior of any simplex of C is intersected
by at most εn hyperplanes of H.

Cuttings are among the most useful, versatile tools in computational ge- 
ometry, as they lay the grounds for efficient divide-and-conquer [1,2,20,26,27]. 
Using some of the sampling technology for finite VC dimension discussed earlier, 
Chazelle [11] proved the following: 

Theorem 13. Given a collection H of n hyperplanes in $\mathbb{R}^d$, for any r > 0,
there exists a (1/r)-cutting for H of optimal size $O(r^d)$. A full description of the
cutting, including the list of hyperplanes intersecting the interior of each simplex,
can be found deterministically in $O(n r^{d-1})$ time.

Here are some direct applications of cuttings: Point location is understood 
here as the problem of preprocessing an arrangement of $n$ hyperplanes in $\mathbb{R}^d$ so
that, given a query point, the face of the arrangement that contains the point
can be found quickly. Simplex range searching is the problem of preprocessing $n$
points in $\mathbb{R}^d$ so that given a query simplex the points inside it can be counted
quickly. 

Theorem 14. Point location among $n$ hyperplanes in $\mathbb{R}^d$ can be done in $O(\log n)$
query time, using $O(n^d)$ preprocessing.



Theorem 15. Deciding whether $n$ points and $n$ lines in the plane are free of
any incidence can be done in $O(n^{4/3} (\log n)^{1/3})$ time.



Theorem 16. Given $n$ points in $\mathbb{R}^d$, there exists a data structure of size $m$
(for any $n \le m \le n^d$), which allows simplex range searching to be done in time
$O(n^{1+\varepsilon}/m^{1/d})$ per query, for any fixed $\varepsilon > 0$.

A far more involved application of cuttings and the discrepancy method gives 
the following result (and its corollary), which was proven by Chazelle [12]. The 
complexity is tight in the worst case. 

Theorem 17. The convex hull of a set of $n$ points in $\mathbb{R}^d$ can be computed
deterministically in $O(n \log n + n^{\lfloor d/2 \rfloor})$ time, for any fixed $d > 1$.



Theorem 18. The Voronoi diagram of a set of $n$ points in $\mathbb{R}^d$ can be computed
deterministically in $O(n \log n + n^{\lceil d/2 \rceil})$ time, for any fixed $d > 1$.

Applications to linear and quadratic programming include the following re- 
sults by Chazelle and Matousek [18]. 

Theorem 19. The ellipsoid of minimum volume that encloses a set of $n$ points
in $\mathbb{R}^d$ can be computed in time $d^{O(d^2)} n$.

Theorem 20. Linear programming with $n$ constraints and $d$ variables can be
solved in $d^{O(d)} n$ time.

These last two results build on important previous work. In particular, we 
mention the general formalism for linear programming developed by Sharir and 
Welzl [43], known as LP-type. The first algorithm for linear programming with a 
running time linear in the number of constraints was found by Megiddo [32,33]. 
Subsequent improvements were found in [19,21,23,24,42]. 

4 Linear Circuit Complexity 

Let $A$ be an $n$-by-$n$ matrix with 0/1 elements. Consider the task of assembling
$A$ by forming a sequence of column vectors $U_1, \ldots, U_s \in \mathbb{Z}^n$, where $s \ge n$ and
(i) $(U_1, \ldots, U_n)$ is the $n$-by-$n$ identity matrix; (ii) $A = (U_{s-n+1}, \ldots, U_s)$; and
(iii) for any $i = n+1, \ldots, s$, there exist $j, k < i$ and $\alpha_i, \beta_i \in \mathbb{Z}$, such that
$U_i = \alpha_i U_j + \beta_i U_k$. The minimum length $s$ of any sequence that satisfies these
three conditions is called the complexity of $A$. It is easy to see that all 0/1 matrices
have complexity $O(n^2)$ and that a random one has complexity $\Omega(n^2 / \log n)$.
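As an illustration of conditions (i) to (iii), the sketch below replays a sequence of linear-combination steps and checks that the last $n$ vectors are the columns of $A$. The names `run_linear_program` and `assembles`, and the encoding of a step as a tuple `(j, k, alpha, beta)`, are assumptions made for this example; the code only verifies a given sequence and says nothing about its minimality.

```python
def run_linear_program(n, steps):
    """Return U_1, ..., U_s: the n identity columns followed by one vector per step.

    Each step (j, k, alpha, beta) appends alpha * U_j + beta * U_k, where j and k
    are 1-based indices of vectors that already exist (so j, k < i).
    """
    vectors = [[1 if r == c else 0 for r in range(n)] for c in range(n)]
    for (j, k, alpha, beta) in steps:
        if not (1 <= j <= len(vectors) and 1 <= k <= len(vectors)):
            raise ValueError("step refers to a vector that has not been built yet")
        uj, uk = vectors[j - 1], vectors[k - 1]
        vectors.append([alpha * a + beta * b for a, b in zip(uj, uk)])
    return vectors

def assembles(A, steps):
    """True if the last n vectors of the sequence are the columns of A, in order."""
    n = len(A)
    vectors = run_linear_program(n, steps)
    columns_of_A = [[A[r][c] for r in range(n)] for c in range(n)]
    return vectors[-n:] == columns_of_A

# Hypothetical example: A = [[1, 1], [0, 1]] assembled with s = 4 vectors.
A = [[1, 1],
     [0, 1]]
steps = [
    (1, 1, 1, 0),  # U_3 = U_1             (first column of A)
    (1, 2, 1, 1),  # U_4 = U_1 + U_2       (second column of A)
]
print(assembles(A, steps))
```

Note that under the definition as stated, even copying an identity column into the final block costs one step, since $A$ must appear as the last $n$ vectors of the sequence.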

The complexity of A is the same as the linear circuit complexity of computing 
$A^T x$. (A circuit consists of gates that can add linear forms.) For the case where
$|\alpha_i|, |\beta_i| = O(1)$ (which is to be understood from now on), Chazelle’s spectral
lemma [14] gives us a line of attack: 

Lemma 1. The complexity of an $n$-by-$n$ 0/1 matrix $A$ is $\Omega(\max_k k \log \lambda_k)$,
where $\lambda_k$ is the $k$-th largest eigenvalue of $A^T A$.

Of course, the same lemma applies to the circuit complexity as well. A recent 
variant by Chazelle and Lvov [17] gives us another powerful tool which bypasses 
the need to bound individual eigenvalues. 

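Lemma 2. The complexity of an $n$-by-$n$ 0/1 matrix $A$ is

$$\Omega_\varepsilon\Bigl( n \log \Bigl( \operatorname{tr} M/n - \varepsilon \sqrt{\operatorname{tr} M^2/n} \Bigr) \Bigr),$$

where $M = A^T A$ and $\varepsilon > 0$ is an arbitrarily small constant.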

The complexity of range searching relates to the complexity of certain ge- 
ometric matrices. A box matrix refers to a set system formed by points and 
axis-parallel boxes. Simplex matrices, on the other hand, denote the incidence 
matrices of set systems formed by points and simplices in $\mathbb{R}^d$. The following
results, by Chazelle [9,10,13,14], make heavy use of the discrepancy method.

Theorem 21. There are $n$-by-$n$ box matrices of circuit complexity $\Omega(n \log\log n)$
in $\mathbb{R}^2$ and monotone circuit complexity $\Omega(n (\log n/\log\log n)^{d-1})$ in $\mathbb{R}^d$.

Theorem 22. There are $n$-by-$n$ simplex matrices of circuit complexity $\Omega(n \log n)$
and monotone circuit complexity $\Omega(n^{4/3})$ in $\mathbb{R}^2$.

Recall that the monotone circuit model disallows the use of subtraction. 
While the monotone complexity of these problems is essentially resolved (there 
are quasi-matching upper bounds), the nonmonotone case is still wide open. 
Abstract. A metalogical framework is a logic with an associated methodology
that is used to represent other logics and to reason about their
metalogical properties. We propose that logical frameworks can be good 
metalogical frameworks when their logics support reflective reasoning 
and their theories always have initial models. 

We present a concrete realization of this idea in rewriting logic. Theo- 
ries in rewriting logic always have initial models and this logic supports 
reflective reasoning. This implies that inductive reasoning is valid when 
proving properties about the initial models of theories in rewriting logic, 
and that we can use reflection to reason at the metalevel about these 
properties. In fact, we can uniformly reflect induction principles for prov- 
ing metatheorems about rewriting logic theories and their parameterized 
extensions. We show that this reflective methodology provides an effec- 
tive framework for different, non-trivial, kinds of formal metatheoretic 
reasoning; one can, for example, prove metatheorems that relate theo- 
ries or establish properties of parameterized classes of theories. Finally, 
we report on the implementation of an inductive theorem prover in the 
Maude system, whose design is based on the results presented in this 
paper. 



1 Introduction 

A logical framework is a logic with an associated methodology that is employed 
for representing and using other logics, theories, and, more generally, formal sys- 
tems. A number of logical frameworks have been proposed and to compare them 
and analyze their effectiveness, it is helpful to distinguish between their intended 
applications. In particular, we can distinguish between logical frameworks, where 
the emphasis is on reasoning in a logic, in the sense of simulating its derivations 
in the framework logic, and metalogical frameworks, where the emphasis is on 
reasoning about logics and even about relationships between logics. Metalogi- 
cal frameworks are more powerful, as they include the ability to reason about 
a logic’s entailment relation, as opposed to merely being adequate to simulate 
entailment.

Induction plays a central role in distinguishing logical frameworks from their 
metalogical counterparts. In a logical framework, representations of proof rules 
are used to construct derivations of (object logic) entailments. This approach is 
taken in logical frameworks like Isabelle [34] and the Edinburgh LF [22]. There,
one may formalize logics and theories where induction is present within the
theory (e.g., Peano Arithmetic), but induction is not present over the theories. 
That is, the framework does not support induction over the terms and proofs of 
a theory. In contrast, in a metalogical framework, it is essential to have induction 
over theories. Standard proof-theoretic arguments usually require induction over 
the formulae or derivations of the object theory. Induction is essential too for 
computer science applications, like reasoning about operational semantics. 



1.1 Reflective Metalogical Frameworks 

In this paper, we propose a new approach to metalogical frameworks motivated 
by the following observation. A logic’s syntax and proofs can be viewed as alge- 
bras, whose carrier sets are inductively built from syntax and proof constructors. 
A logical framework and a metalogical framework can share these as a common 
basis. However, whereas for a logical framework the application of these con- 
structors suffices to simulate derivations of the object logic, for a metalogical 
framework, our representation must additionally preserve the inductive nature 
of these algebras. That is, a formalization in the metalogic should have an initial 
model corresponding to the syntax and proofs of the formalized object logic. 

Our proposal is that for some logical frameworks — namely, those that are 
reflective and whose theories have initial models — we can take the step from a 
logical framework to a metalogical framework by reflecting at the metalevel the 
induction principles for the formalized logics. We sum this up with the slogan 
“logical frameworks with reflection and initiality are metalogical frameworks”.

After making this idea precise, we give a concrete realization of it using 
rewriting logic and present an example. Our example is a standard one in metar- 
easoning: the deduction theorem for minimal logic (of implication). Rewriting 
logic is not the only candidate for a reflective metalogical framework, but we 
believe it is a good one. Rewriting logic has been demonstrated to be a good 
logical framework [11,23,24,30,36,37] and it is balanced on a point where it is 
expressive enough to naturally formalize different entailment systems, but it is 
weak enough so that its theories always have initial models. This means that 
there are sound induction principles for reasoning with respect to these models. 
To prove metatheorems about theories in rewriting logic and their parameterized 
extensions, the key is to reflect these reasoning principles at the metalevel. 

Overall, we see our contributions as both theoretical and practical. Theo- 
retically, our work contributes to answering the question “what is a metalogical 
framework?” by proposing reflective logical frameworks, whose theories have initial
models, as a possible answer. Moreover, it illuminates the interrelationship
between logical and metalogical frameworks, and the role of reflection as a key 
ingredient for turning a logical framework with initial models into a metalogical 
one. Practically, we provide evidence that rewriting logic, combined with reflec- 
tion, is an effective metalogical framework that can be used for nontrivial kinds 
of metatheoretic reasoning. 
1.2 Related Work

Various approaches have been considered in the past to strengthen logical frame- 
works so that they can function as metalogical frameworks. All of these differ 
significantly from our proposal both in their logical basis and in the role of 
reflection in metareasoning. 

One approach is to formalize theories in a framework logic supporting some 
notion of module, where each module is explicitly equipped with its own in- 
duction principle. For example, in [3], theories were formalized by collections of 
parameterized modules ($\Sigma$-types) within the Nuprl type theory (a constructive,
higher-order logic), and each module included its own induction principle for 
reasoning about terms or proofs. This approach is powerful and can be used, for 
example, to relate different theories formalized in this way. 

An alternative approach is to formalize theories directly using inductive def- 
initions in a framework logic or framework theory that is strong enough to for- 
malize the corresponding induction principles. A simple example of this is the 
first-order theory $FS_0$ of [19], which has been used by [25] to carry out experiments
in formal metatheory. In $FS_0$, inductive definitions are terms in the
framework theory, which has an induction rule for reasoning about such terms. 

Another common choice is to formalize theories as inductive definitions 
in strong “foundational” framework logics such as higher-order logic or set- 
theory [21,33], or in a type theory like the calculus of constructions with in- 
ductive definitions [32]. In higher-order logic and set theory one can internally 
develop a theory of inductive definitions, where inductive definitions correspond 
to terms in the metatheory (e.g., formalized as the least fixedpoint of a mono- 
tonic function) and, from the definition, induction principles are formally derived 
within the framework logic. Alternatively, in the calculus of constructions, given 
an inductive definition, induction principles are simply added, soundly, to the 
metalogic. Current research in this area focuses on appropriate induction prin- 
ciples for logics that support higher-order abstract syntax [17,26,35]. 

Organization 

The remainder of our paper is organized as follows. In Section 2 we present the 
idea of a reflective metalogical framework and abstractly formalize our require- 
ments for such a metalogic. In Section 3 we present background material on 
rewriting logic, membership equational logic, and the Maude language. In Sec- 
tion 4 we discuss induction principles for membership equational theories and 
present a simple notion of parameterized membership equational theory. In Sec- 
tion 5 we discuss how rewriting logic can be used as a logical framework, and in 
Sections 6 and 7 we show how to combine initiality and reflection to use rewrit- 
ing logic as a metalogical framework. In particular, we show how to reflect, in 
a uniform way, induction principles for reasoning, at the metalevel, about the- 
ories and their parameterized extensions. After this, we present in Section 8 an 
example of formal metareasoning using rewriting logic as a metalogical frame- 
work, namely the proof of the deduction theorem, and we draw conclusions in 
Section 9. 
2 Reflective Metalogical Frameworks

In this section we begin by defining reflective logics. Based on this, we then
describe properties sufficient for a reflective logical framework to function as a
reflective metalogical framework.



2.1 Reflective Logics 

Intuitively, a reflective logic is a logic in which important aspects of its metathe- 
ory, such as theories and entailment, can be represented and reasoned about in 
the logic. A general axiomatic notion of reflective logic was recently proposed 
in [7,14]. The notion is itself expressed in terms of the more general axiomatic 
notion of an entailment system [27], which captures the entailment relation of a 
logic. For our purposes here, an entailment system $\mathcal{E}$ consists of the following:

1. a class $Sign$ of signatures, where each signature $\Sigma \in Sign$ specifies the syntax
of a language;

2. a function $sen$ assigning to each signature $\Sigma \in Sign$ a set $sen(\Sigma)$ of its
sentences;

3. for each signature $\Sigma \in Sign$, an entailment relation $\vdash_\Sigma$, where $\vdash_\Sigma \subseteq
\mathcal{P}(sen(\Sigma)) \times sen(\Sigma)$ and $\vdash_\Sigma$ satisfies the properties of reflexivity, monotonicity,
and transitivity (or cut); in what follows, we omit the subscript of
$\vdash_\Sigma$ when $\Sigma$ is clear from the context.

A theory in $\mathcal{E} = (Sign, sen, \vdash)$ is then a pair $T = (\Sigma, \Gamma)$ consisting of a signature
$\Sigma \in Sign$ and a set of sentences $\Gamma \subseteq sen(\Sigma)$. We can extend the entailment
relation to theories in the obvious way by defining $(\Sigma, \Gamma) \vdash \varphi$ iff $\Gamma \vdash \varphi$, for
$\varphi \in sen(\Sigma)$.

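Definition 1 Given an entailment system $\mathcal{E}$ and a nonempty set of theories $\mathcal{C}$
in it, a theory $U$ is $\mathcal{C}$-universal if there is a function, called a representation
function,

$$(\overline{\_ \vdash \_}) : \bigcup_{T \in \mathcal{C}} (\{T\} \times sen(T)) \to sen(U),$$

such that for each $T \in \mathcal{C}$, $\varphi \in sen(T)$,

$$T \vdash \varphi \quad \text{iff} \quad U \vdash \overline{T \vdash \varphi}. \qquad (1)$$

If, in addition, $U \in \mathcal{C}$, then the entailment system $\mathcal{E}$ is called $\mathcal{C}$-reflective. Finally,
a reflective logic is a logic whose entailment system is $\mathcal{C}$-reflective for $\mathcal{C}$,
the class of all finitely presentable theories in the logic.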
2.2 Requirements for a Reflective Metalogical Framework

We now consider what we require from a logical framework so that it can func- 
tion as a metalogical framework. As indicated in Section 1.2, various approaches 
to formal metareasoning have been proposed in the past. Our approach is based 
on reflective reasoning and initiality, and here we present, abstractly, our requirements
for this. They are:

1. the logical framework is weak enough so that there are valid induction prin- 
ciples for reasoning about all its theories, 

2. the logical framework is expressive enough so that it really is a viable logical 
framework, and 

3. the logical framework is reflective. 

Note that 1 specifies a requirement on the framework logic and can be alter- 
natively formulated in an abstract and logic-independent way. If the framework 
logic is such that its theories have initial models, then an appropriate form of 
inductive reasoning is always valid when proving sentences with respect to the 
initial models of its theories. This method is very general; for example, for equational
logic, induction and initiality are equivalent concepts [31].¹

We now explain why the requirements listed above are sufficient for turning a 
logical framework into a metalogical framework. If requirement 2 is satisfied, then 
logics and their entailment relations can be represented as theories in the logical 
framework, and if requirement 1 is also satisfied, then these representations can 
preserve the inductive nature of the algebras characterizing the syntax and proofs 
of the logics that they represent. As a consequence, proof-theoretic arguments 
requiring induction over the formulae or over the derivations of a logic can be 
applied in the framework logic. This is enough when proving theorems about a 
logic. These theorems can be formalized as sentences about the initial model of 
the theory representing the object logic under consideration, and can be proved 
by induction. 

However, when dealing with metatheorems we often require something more. 
Metatheorems may relate different logics in a family of logics. Consider, for example,
the deduction theorem for minimal logic (of implication). This is actually
a metatheorem not about a particular deduction system, but rather a metatheorem
that relates different deduction systems: one in which $A \to B$ is proved
and a second (which is obtained from the first by adding the axiom A) in which 
B is proved. In our setting, this means that sentences formalizing metatheorems 
should relate initial models of different theories. Here is where reflection plays a 
decisive role. Namely, if the logical framework satisfies requirement 3, then: (3a) 

¹ Since the notion of initiality is very general, the corresponding inductive reasoning
principles may in each case take different forms. For example, in an equational logic 
allowing infinitary operations of arity smaller than a given regular cardinal a, the 
inductive principles will be transfinite. We are mainly interested in logical frameworks 
suitable for representing finitary logics. Therefore, we will in practice be interested 
in a finitary framework logic whose theories have initial models and whose induction 
principles are also finitary. 
it contains a universal theory where the metalevel of its logic can be reflected,
and (3b) the universal theory is itself a theory in the logical framework. By 3b 
and 1 the universal theory has an initial model. The key then is to exploit 3a in 
order to formalize relationships between the initial models of object theories as 
theorems about the initial model of the universal theory, and turn, by reflection, 
induction principles for reasoning about the initial models of object theories into 
induction principles for reasoning about the initial model of the universal theory. 

In the following sections, we will give a concrete instance of these ideas for 
the case of rewriting logic. In particular, we will show that for a certain class of 
rewriting logic theories the induction principles for reasoning, at the metalevel, 
about these theories correspond, in a simple way, to the induction principles for 
reasoning about the inductive properties of the theories. Moreover, by reasoning 
by induction in the universal theory we can inductively reason about properties 
satisfied by families of theories. This provides us with capabilities analogous to 
what is possible in metalogical frameworks based on parameterized inductive 
definitions. 



3 Background 

In this section we provide background material on rewriting logic, membership 
equational logic, and the Maude language. The material presented here is stan- 
dard. We postpone discussion of the reflective aspects to Section 6. 



3.1 Rewriting Logic 

Rewriting logic [28] is a simple logic whose sentences are sequents of the form 
$t \to t'$, with $t$ and $t'$ $\Omega$-terms on a given signature $\Omega$. Theories in rewriting logic
are triples $(\Omega, E, R)$, with $\Omega$ a signature of operators, $E$ a set of $\Omega$-equations,
and $R$ a collection of (possibly conditional [28]) $\Omega$-rewrite rules.

The inference rules of rewriting logic [28] allow the derivation of all rewrites 
possible in a given theory. Thus, from the logical point of view, we can think of 
rewriting logic as a framework logic in which formulae are formalized as elements 
of the initial model of an equational theory $(\Omega, E)$ and an inference system is
formalized by expressing each inference rule as a (possibly conditional) rewrite 
rule. Rewriting is understood modulo the equations E. This supports a flexible 
and abstract kind of inference where the equations can take care of structural 
bookkeeping. For example, when formalizing sequent calculi, structural rules 
for sequents can be “internalized” by rewriting modulo appropriate equational 
axioms such as associativity, associativity-commutativity, and so on. 
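The idea of letting equations absorb structural bookkeeping can be sketched outside of rewriting logic with a small canonical-form computation. In the snippet below, contexts are nested tuples built with a binary comma, formulas are plain strings, and the names `normalize` and `equal_modulo_ac` are hypothetical; this is an assumption-laden illustration of equality modulo associativity and commutativity, not Maude's matching algorithm.

```python
def normalize(context):
    """Canonical form of a context built with an associative-commutative ','.

    A context is either a formula (a string) or a tuple (",", left, right).
    Flattening handles associativity; sorting handles commutativity.
    """
    if isinstance(context, tuple) and context[0] == ",":
        items = []
        for part in context[1:]:
            items.extend(normalize(part))
        return tuple(sorted(items))
    return (context,)

def equal_modulo_ac(c1, c2):
    """True if two contexts are equal modulo associativity and commutativity."""
    return normalize(c1) == normalize(c2)

# Two bracketings and orderings of the same context A, B, C.
left = (",", "A", (",", "B", "C"))
right = (",", (",", "C", "A"), "B")
print(equal_modulo_ac(left, right))
```

With contexts compared this way, an exchange rule for sequents needs no explicit inference step. Contraction is not absorbed, because duplicates are kept: `(",", "A", "A")` and `"A"` remain distinct.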

Since a rewrite theory $(\Omega, E, R)$ has an underlying equational theory $(\Omega, E)$,
rewriting logic is parameterized by the choice of the equational logic. An attrac- 
tive choice in terms of expressiveness is membership equational logic [29], a logic 
that has sorts, subsorts, overloading of function symbols, and is capable of ex- 
pressing partiality using equational conditions. Since we can view an equational 
theory $(\Omega, E)$ as a rewrite theory $(\Omega, E, \emptyset)$, there is an obvious sublogic inclusion,
MEqtl $\subseteq$ RWLogic, from membership equational logic into rewriting logic.
Both membership equational logic and rewriting logic have initial models [28,29], 
which provide the basis for reasoning by induction. 



3.2 Membership Equational Logic 

Membership equational logic is an expressive version of equational logic. A full 
account of the syntax and semantics of membership equational logic can be found 
in [6,29]. Here we define the basic notions needed in this paper. 

A signature in membership equational logic is a triple $\Omega = (K, \Sigma, S)$ with $K$ a
set of kinds, $\Sigma$ a $K$-kinded signature $\Sigma = \{\Sigma_{w,k}\}_{(w,k) \in K^* \times K}$, and $S = \{S_k\}_{k \in K}$
a pairwise disjoint $K$-kinded family of sets. We call $S_k$ the set of sorts of kind
$k$. The pair $(K, \Sigma)$ is what is usually called a many-sorted signature of function
symbols; however we call the elements of $K$ kinds because each kind $k$ now has
a set $S_k$ of associated sorts, which in the models will be interpreted as subsets of
the carrier for the kind. Also, as usual, we denote by $T_\Sigma$ the $K$-kinded algebra
of ground $\Sigma$-terms, and by $T_\Sigma(X)$ the algebra of $\Sigma$-terms on the $K$-kinded set
of variables $X$.

The atomic formulae of membership equational logic are either equations
$t = t'$, where $t$ and $t'$ are $\Sigma$-terms of the same kind, or membership assertions
of the form $t : s$, where the term $t$ has kind $k$ and $s \in S_k$. Sentences are Horn
clauses on these atomic formulae, i.e., sentences of the form

$$\forall (x_1, \ldots, x_m).\ A_1 \wedge \ldots \wedge A_n \Rightarrow A_0,$$

where each $A_i$ is either an equation or a membership assertion, and each $x_j$ is a
$K$-kinded variable. For example, Figure 1 gives a set of membership equational
axioms specifying minimal logic of implication, where SentConstant, Formula,
and Theorem are sorts formalizing sentential constants, formulae, and theorems,
respectively.² A theory in membership equational logic is a pair $(\Omega, E)$, where
$E$ is a finite set of sentences in membership equational logic over the signature
$\Omega$. The way in which partiality is expressed in membership equational logic is
by the fact that terms always have a kind, but may not have a sort. Terms for
which a sort cannot be established from the axioms $E$ correspond to undefined
or error elements.

We employ standard semantic concepts from many-sorted logic. Given a signature
$\Omega = (K, \Sigma, S)$, an $\Omega$-algebra is a many-kinded $\Sigma$-algebra (that is, a
$K$-indexed set $A = \{A_k\}_{k \in K}$ together with a collection of appropriately kinded
functions interpreting the function symbols in $\Sigma$) together with an assignment
to each sort $s \in S_k$ of a subset $A_s \subseteq A_k$. Hence, sorts can be thought of as
unary predicates that semantically denote subsets of the appropriate kind. An
algebra $A$ and a (kind-respecting) valuation $\sigma$, assigning to variables of kind

² Note that we write the object logic connective $\to$ in infix. We will consider this
example in more detail in Section 5. 
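$\forall A.\ A : \mathit{SentConstant} \Rightarrow A : \mathit{Formula}$,

$\forall A.\ A : \mathit{Theorem} \Rightarrow A : \mathit{Formula}$,

$\forall (A, B).\ A : \mathit{Formula} \wedge B : \mathit{Formula} \Rightarrow A \to B : \mathit{Formula}$,
$\forall (A, B).\ A : \mathit{Formula} \wedge B : \mathit{Formula} \Rightarrow A \to (B \to A) : \mathit{Theorem}$,
$\forall (A, B, C).\ A : \mathit{Formula} \wedge B : \mathit{Formula} \wedge C : \mathit{Formula}$
$\quad \Rightarrow (A \to B) \to ((A \to (B \to C)) \to (A \to C)) : \mathit{Theorem}$,

$\forall (A, B).\ A : \mathit{Formula} \wedge B : \mathit{Formula} \wedge (A \to B) : \mathit{Theorem} \wedge A : \mathit{Theorem}$
$\quad \Rightarrow B : \mathit{Theorem}$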



Fig. 1. Membership equational axioms for minimal logic. 



$k$ values in $A_k$, satisfy an equation $t = t'$ iff $\sigma(t) = \sigma(t')$, where we overload
notation by identifying $\sigma$ with its unique homomorphic extension to $\Sigma$-terms.
We write $A, \sigma \models t = t'$ to denote such a satisfaction. Similarly, $A, \sigma \models t : s$ holds
iff $\sigma(t) \in A_s$.

Note that an $\Omega$-algebra is nothing but a $K$-kinded first-order model with
function symbols $\Sigma$ and an alphabet of unary predicates $\{S_k\}_{k \in K}$. Therefore,
the satisfaction relation can be extended to Horn and first-order formulae $\phi$ over
these atomic formulae in the standard way. We write $A \models \phi$ when the formula
$\phi$ is satisfied for all valuations $\sigma$, and then say that $A$ is a model of $\phi$. Similarly,
a theory $(\Omega, E)$ in membership equational logic is simply a Horn theory for the
associated signature, when $\Omega$ is viewed as a first-order $K$-kinded signature. As
usual, for $\phi$ a first-order sentence in the language of $\Omega$, we write $(\Omega, E) \models \phi$
when all the models of the set $E$ of sentences are also models of $\phi$.

Theories in membership equational logic have initial models [29]. This pro- 
vides the basis for reasoning by induction, as is explained in detail in Section 4.1. 
We write $(\Omega, E) \mid\!\approx \phi$ to denote that the initial model of the membership equational
logic theory $(\Omega, E)$ is also a model of $\phi$. Note that even though we restrict
the axioms $E$ to Horn clauses, we will employ first-order formulae $\phi$ to formalize
properties satisfied by the initial model, that is, inductive properties. 

3.3 The Maude System 

The Maude system [9,13] implements rewriting logic and has been designed with 
the explicit aims of supporting executable specification and reflective computa- 
tion. Theories are specified in Maude by modules, of which there are two kinds: 
functional modules and system modules. Maude’s functional modules are theories 
in membership equational logic. Equations in Maude’s functional modules are 
assumed to be Church-Rosser and terminating; they are executed by the Maude 
rewrite engine according to the rewriting techniques and operational semantics 
developed in [6]. Maude’s system modules are rewrite theories. The rules in a 
system module are not necessarily Church-Rosser or terminating. 

The semantics of a functional (respectively system) module is initial, i.e., such 
a module denotes the initial model in membership equational logic (respectively 
rewriting logic) of the theory thus specified. The syntax for functional modules 
fmod MINIMAL is
  sorts SentConstant Formula Theorem .
  subsort SentConstant < Formula .
  subsort Theorem < Formula .
  op _→_ : Formula Formula -> Formula .
  vars A B C : Formula .
  mb A → (B → A) : Theorem .
  mb (A → B) → ((A → (B → C)) → (A → C)) : Theorem .
  cmb B : Theorem if (A → B) : Theorem and A : Theorem .
endfm

Fig. 2. The module MINIMAL. 



is of the form fmod $(\Omega, E)$ endfm, with $(\Omega, E)$ a membership equational theory
meeting the requirements mentioned above. Figure 2 gives an example of a functional
module in Maude syntax, where LaTeX symbols are used instead of ASCII
characters to improve readability. Note that Maude’s syntax for functional modules
is syntactic sugar for introducing finite sets of membership axioms (Figure 2
is just the sugared version of Figure 1), and we will use it from now on to present
membership equational theories. In particular, (possibly conditional) equations
and membership axioms in Maude are Horn clauses in membership equational
logic; any operation declaration op f : s_1 ... s_n -> s corresponds to the Horn
clause

$$\forall (x_1, \ldots, x_n).\ x_1 : s_1 \wedge \ldots \wedge x_n : s_n \Rightarrow f(x_1, \ldots, x_n) : s,$$

where $x_i$ is a variable of kind $k_i$ and $s_i \in S_{k_i}$, for $i \in \{1, \ldots, n\}$; also, any subsort
declaration subsort s < s' can be reduced to the sentence

$$\forall x.\ x : s \Rightarrow x : s',$$

where $x$ is a variable of kind $k$ and $s, s' \in S_k$. Finally, kinds are not explicitly
defined in Maude modules, but are instead inferred by the system as determined
by the different connected components of the poset of sorts.
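The inference of kinds from subsort declarations can be sketched as a connected-components computation. The function `infer_kinds` below is a hypothetical helper written for illustration (it is not part of Maude), and the extra sort `Nat` in the second call is an assumption added only to show a second component.

```python
def infer_kinds(sorts, subsorts):
    """Group sorts into kinds: the connected components of the subsort relation.

    `subsorts` is a list of pairs (s, s') standing for declarations `subsort s < s'`.
    """
    parent = {s: s for s in sorts}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for lower, upper in subsorts:
        parent[find(lower)] = find(upper)

    components = {}
    for s in sorts:
        components.setdefault(find(s), set()).add(s)
    return sorted(sorted(component) for component in components.values())

# The sorts and subsort declarations of the module MINIMAL form a single kind.
minimal_kinds = infer_kinds(
    ["SentConstant", "Formula", "Theorem"],
    [("SentConstant", "Formula"), ("Theorem", "Formula")],
)
print(minimal_kinds)

# An unrelated sort (hypothetical) ends up in a kind of its own.
print(infer_kinds(["SentConstant", "Formula", "Theorem", "Nat"],
                  [("SentConstant", "Formula"), ("Theorem", "Formula")]))
```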

As additional syntactic sugar, we shall often write $\forall x : s.\ \phi(x)$ as shorthand
for the formula $\forall x.\ x : s \Rightarrow \phi(x)$, for $x$ a variable of kind $k$ and $s \in S_k$. Moreover,
for the formula $x : s \Rightarrow \phi(x)$, we say that “$x$ is of sort $s$ (in $\phi$).”

4 Induction and Parameterization 

In this section we introduce two concepts that play key roles in the rest of the 
paper. We define an induction principle for membership equational theories and 
show how such theories can also be parameterized. We introduce these concepts 
in a simple setting that is adequate to illustrate the main ideas and carry out 
applications. 
4.1 Induction Principles for Membership Equational Theories

Given that membership equational logic is a subset of equational Horn logic 
(indeed, they can be shown to be equivalent [29]) it follows immediately that 
any theory $(\Omega, E)$ has a unique (up to isomorphism) initial model [20]. The
following is an induction principle for reasoning about properties of sorts, with 
respect to this model. 



Definition 2 (Induction over sort definitions) Let $T = (\Omega, E)$ be a theory
in membership equational logic and let $s$ be a sort in some $S_k$. Let $C_{[T,s]} =
\{C_1, \ldots, C_n\}$ be those sentences in $E$ that specify $s$, i.e., those $C_i$ of the form

$$\forall (x_1, \ldots, x_{p_i}).\ A_1 \wedge \ldots \wedge A_{q_i} \Rightarrow A_0, \qquad (2)$$

where, for some $t$ of kind $k$ and $s \in S_k$, $A_0$ is $t : s$.

For $\tau$ a first-order formula with free variable $x$ of sort $s$ over the signature
$\Omega$, an induction principle for $(\Omega, E)$, with respect to $x : s$ and $\tau(x)$ is the formula

$$\psi_1 \wedge \ldots \wedge \psi_n \Rightarrow \forall x : s.\ \tau(x) \qquad (3)$$

where, for $1 \le i \le n$ and $C_i$ of the form (2), $\psi_i$ is

$$\forall (x_1, \ldots, x_{p_i}).\ [A_1]_\tau \wedge \ldots \wedge [A_{q_i}]_\tau \Rightarrow [A_0]_\tau \qquad (4)$$

and, for $0 \le j \le q_i$,

$$[A_j]_\tau = \begin{cases} \tau(u) & \text{if } A_j = u : s, \text{ for } u \text{ of kind } k \\ A_j & \text{otherwise} \end{cases}$$

For a given membership equational theory $(\Omega, E)$, the above defines an induction
schema (ind), given by (3), in many-kinded first-order logic over the signature
$\Omega$.³ Note that for $q_i = 0$, the nullary conjunction in the antecedent of (4) is true
and the implication can be replaced with the succedent $\tau(t)$.

In the initial model of a membership equational theory, sorts are interpreted 
as the smallest sets satisfying the axioms in the theory, and equality is inter- 
preted as the smallest congruence satisfying those axioms. Alternatively, the sets 
interpreting sorts can be characterized as being inductively generated in stages. 
This corresponds to the fixedpoint characterization of the least Herbrand model 
of a collection of Horn clauses [38], and the induction principle we have given 
formalizes induction over the stages in which the set is inductively defined [1]. By
induction over the stages of the inductive definition of a sort s, which amounts
to an induction over the proof that some ground term of kind k is of sort s,
we can establish that reasoning in the membership equational theory $(\Omega, E)$,
augmented by (ind), is sound.
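The generation of a sort in stages can be made concrete with a small sketch. The Python code below is only an illustration under stated assumptions: formulas are encoded as nested tuples, the K and S schemata are those of the minimal logic example, and the iteration is restricted to a finite, hand-picked `universe` of formulas (so it under-approximates the sort Theorem, whose true stages range over all ground terms). The names `imp`, `parts`, `is_k`, `is_s` and `stages` are hypothetical helpers, not part of Maude or of the paper's tools.

```python
# Formulas: an atom is a string; an implication is the tuple ("->", left, right).
def imp(a, b):
    return ("->", a, b)

def parts(f):
    """Return (left, right) if f is an implication, otherwise None."""
    if isinstance(f, tuple) and len(f) == 3 and f[0] == "->":
        return f[1], f[2]
    return None

def is_k(f):
    """K schema: A -> (B -> A)."""
    outer = parts(f)
    if outer is None:
        return False
    inner = parts(outer[1])
    return inner is not None and inner[1] == outer[0]

def is_s(f):
    """S schema: (A -> B) -> ((A -> (B -> C)) -> (A -> C))."""
    outer = parts(f)
    if outer is None:
        return False
    p_ab, p_rest = parts(outer[0]), parts(outer[1])
    if p_ab is None or p_rest is None:
        return False
    p_abc, p_ac = parts(p_rest[0]), parts(p_rest[1])
    if p_abc is None or p_ac is None:
        return False
    p_bc = parts(p_abc[1])
    if p_bc is None:
        return False
    a, b = p_ab
    return (p_abc[0] == a and p_bc[0] == b
            and p_ac[0] == a and p_ac[1] == p_bc[1])

def stages(universe):
    """Yield the increasing stages of Theorem, restricted to `universe`."""
    current = set()
    while True:
        # Axiom clauses contribute at every stage.
        nxt = {f for f in universe if is_k(f) or is_s(f)}
        # Modus ponens clause: from A -> B and A (both already derived) add B.
        for f in current:
            p = parts(f)
            if p is not None and p[0] in current and p[1] in universe:
                nxt.add(p[1])
        nxt |= current
        if nxt == current:      # least fixpoint reached
            return
        current = nxt
        yield set(current)

p = "p"
pp = imp(p, p)
k1 = imp(p, imp(pp, p))          # K with A := p, B := p -> p
k2 = imp(p, pp)                  # K with A := p, B := p
s1 = imp(k2, imp(k1, pp))        # S with A := p, B := p -> p, C := p
universe = {k1, k2, s1, imp(k1, pp), pp}
result = list(stages(universe))
```

In this run the formula `p -> p` enters the set only at the third stage, after two applications of modus ponens; an induction over stages follows exactly this order.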

^ This induction schema cannot be directly formalized in the sublogic membership
equational logic, since it is not, in general, a sentence in membership equational 
logic. However, as we will later see, inference rules for (many-kinded) first-order 
theories — like this induction schema — can be encoded in rewriting logic and can be 
used to prove properties, at the metalevel, about membership equational theories. 
Theorem 1 (Soundness) Let $(\Omega, E)$ be a membership equational theory. If
$(\Omega, E \cup \{(\mathrm{ind})\}) \vdash \tau$, then $\tau$ holds in the initial model of $(\Omega, E)$.

As an example, consider the membership equational theory for minimal logic 
previously given in Figure 1. Definition 2 gives rise to the following induction 
principle over the sort Theorem: 

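$$\begin{array}{l}
[\, \forall (A, B).\ (A{:}\mathtt{Formula} \wedge B{:}\mathtt{Formula} \Rightarrow \tau(A \to (B \to A))) \ \wedge \\
\ \ \forall (A, B, C).\ (A{:}\mathtt{Formula} \wedge B{:}\mathtt{Formula} \wedge C{:}\mathtt{Formula} \\
\qquad \Rightarrow \tau((A \to B) \to ((A \to (B \to C)) \to (A \to C)))) \ \wedge \\
\ \ \forall (A, B).\ (A{:}\mathtt{Formula} \wedge B{:}\mathtt{Formula} \wedge \tau(A \to B) \wedge \tau(A) \Rightarrow \tau(B)) \,] \\
\Rightarrow \forall A{:}\mathtt{Theorem}.\ \tau(A)
\end{array}$$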

This axiom formalizes induction over the structure of proofs in minimal logic. 

Note that other induction principles are possible. In particular (ind) takes 
all of the sentences that specify membership in s as constituting the inductive 
definition of s. In some cases, a subset of the sentences (sometimes called gen- 
erators or constructors) is sufficient to characterize an inductive definition. Of 
course additional proof obligations then arise, e.g., sufficient completeness of the 
chosen subset (see [6,10]). 



4.2 Parameterized Membership Equational Theories 

When carrying out metalogical reasoning, we often reason not about a fixed 
theory, but about a parameterized family of theories. There are many different 
ways in which a theory in membership equational logic may be parameterized. 
For the purposes of this paper it will be enough to consider the notion of a 
parameterized extension of a given theory T which, intuitively, describes the 
extensions of T by a parametric set of new axioms. 

Definition 3 Let $T = (\Omega, E)$ be a theory in membership equational logic with
$\Omega = (K, \Sigma, S)$. Then, a parameterized extension of $T$ (by parameters $V$ and
axioms $G$) is a membership equational theory $T_G[V] = (\Omega[V], E \cup G)$, with
$\Omega[V] = (K, \Sigma \cup V, S)$, where the $K$-kinded signatures $\Sigma$ and $V$ are mutually
disjoint and $V$ consists only of constants. We call $T_G[V]$ a parameterized
membership equational theory.

Let $\beta : V \to T_\Sigma$ be a $K$-kinded function. Then, $T_G[\beta] = (\Omega, E \cup \beta(G))$
denotes an instance of $T_G[V]$, where $\beta(G)$ is the homomorphic extension of $\beta$
to axioms.

The substitution $\beta$ is used to generate instances of a parameterized
membership equational theory. Namely, the new axioms $G$ are instantiated so that
all occurrences of the parameters in $V$ are replaced by ground terms. For example, if
$G = \{f(v) : s\}$ and $\beta(v) = q(a,b)$, then the parametric axiom $f(v) : s$ is
translated as $\beta(f(v) : s) = f(q(a, b)) : s$. The result is well-kinded under the
(non-parameterized) signature $\Omega$.
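As a minimal sketch of this instantiation step, the following Python fragment applies a substitution to the parametric axioms of the example above. It assumes a simple encoding that is ours, not the paper's: terms are nested tuples whose first component is the operator name, constants are 1-tuples, and only unconditional membership axioms (pairs of a term and a sort) are handled. The function names `substitute` and `instantiate` are hypothetical.

```python
# Terms are nested tuples ("op", arg1, ...); a constant is a 1-tuple such as ("a",).
def substitute(term, beta):
    """Homomorphic extension of beta (parameter name -> ground term) to terms."""
    head, *args = term
    if not args and head in beta:
        return beta[head]                      # a parameter: replace it
    return (head, *(substitute(a, beta) for a in args))

def instantiate(axioms, beta):
    """Apply beta to unconditional membership axioms given as (term, sort) pairs."""
    return [(substitute(t, beta), s) for t, s in axioms]

# The example from the text: G = { f(v) : s } and beta(v) = q(a, b).
G = [(("f", ("v",)), "s")]
beta = {"v": ("q", ("a",), ("b",))}
instance = instantiate(G, beta)   # the single axiom f(q(a, b)) : s
```

Because parameters are constants, the substitution never has to rename or capture variables; it simply rebuilds each term around the replaced constants.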
fmod MINIMAL_S[X] is
  sorts SentConstant Formula Theorem .
  subsort SentConstant < Formula .
  subsort Theorem < Formula .
  op X : -> Formula .                       *** the parameter
  op _->_ : Formula Formula -> Formula .    *** implication
  vars A B C : Formula .
  mb X : Theorem .                          *** the parametric axiom
  mb A -> (B -> A) : Theorem .              *** K schema
  mb (A -> B) -> ((A -> (B -> C)) -> (A -> C)) : Theorem .   *** S schema
  cmb B : Theorem if (A -> B) : Theorem and A : Theorem .    *** modus ponens
endfm

Fig. 3. The module MINIMAL_S[X].



Figure 3 provides an example of a parameterized module MINIMAL_S[X], with
X a parameter of the kind of the sort Formula, and S a set that contains only the
parametric axiom X:Theorem. We will later see how this parameterized module
can be used to formalize the deduction theorem.

5 Rewriting Logic as a Logical Framework 

As we have already said, from the logical point of view we can think of rewriting 
logic as a framework logic in which an inference system can be formalized by 
expressing each inference rule as a (possibly conditional) rewrite rule in a rewrite 
theory $(\Omega, E, R)$. Note that rewriting logic is noncommittal about the structure
and properties of the formulae expressed by $\Omega$-terms. They are user-definable as
an algebraic data type satisfying equational axioms, so that rewriting deduction 
takes place modulo such axioms. Because of this ecumenical neutrality and the 
simplicity of the rules of the logic, rewriting logic can be effectively applied as a 
logical framework. In [11,23,24,30,36,37], many examples of logic representations
are given, including first-order linear logic, sequent presentations of modal and 
propositional logics, Horn logic with equality, the lambda calculus, and higher- 
order pure type systems, among others. In all such examples, the representational 
distance between the object logic and its representation in rewriting logic is 
virtually zero, that is, the representations are direct and reasoning with them 
faithfully simulates reasoning in the original logics. 

In fact, there are several ways of conservatively representing a logic (with a 
finitary syntax and inference system) in rewriting logic. As mentioned before, 
a simple and direct way is to turn the inference rules into rewrite rules, which 
may be conditional if the inference rules have side conditions. Alternatively, we 
can use the underlying membership equational logic to represent theoremhood 
in a logic as a sort in a membership equational theory. Conditional membership 
axioms then directly support the representation of rules as schemas, which is 
typically used in presenting logics and formal systems. The module MINIMAL, 
presented previously in Figure 2, represents minimal logic in membership equational
logic using this idea. A formula A is a theorem in minimal logic if and
only if A is a term of sort Theorem in MINIMAL. Note that this representation 
preserves the inductive nature of the set of theorems and proofs in minimal logic. 

Similarly, we can represent theoremhood in a parameterized family of logics
as a sort in a parameterized membership equational theory.^ As an example,
consider the parameterized theory MINIMAL_S[X] in Figure 3, with X a parameter
of the kind of the sort Formula, and S a set that contains only the parametric
axiom X:Theorem. This parameterized theory represents the family of logics that
includes any extension of minimal logic with a new axiom in the following sense:
a formula B is a theorem in minimal logic extended with a new axiom A if and
only if B is a term of sort Theorem in MINIMAL_S[A], where MINIMAL_S[A] is the
instance MINIMAL_S[$\beta$] of MINIMAL_S[X], with $\beta(X) = A$.

The ability to represent parameterized families of logics is important for using 
rewriting logic also as a metalogical framework, and we will give an example of 
this in the experimental work reported on in Section 8. 

6 Reflection in Rewriting Logic and Maude

In this section we explain how rewriting logic is reflective and how the Maude sys- 
tem implements reflective rewriting logic deduction. We also introduce a Boolean 
function that reflects at the metalevel the membership relation in membership 
equational logic, which will be used in later sections. Finally, we explain how 
to combine the use of rewriting logic as a logical framework and the reflective 
capabilities of Maude to build a theorem prover for carrying out inductive proofs. 

6.1 Reflection in Rewriting Logic 

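Rewriting logic is reflective [7,15,16]. There is a universal theory UNIVERSAL, and
a representation function $\overline{(\_ \vdash \_)}$ encoding pairs consisting of a rewrite theory $T$
and a sentence in it as sentences in UNIVERSAL. For any finitely presented rewrite
theory $T$ (including UNIVERSAL itself) and any terms $t$, $t'$ in $T$, the representation
function is defined by

$$\overline{T \vdash t \rightarrow t'} = \langle \overline{T}, \overline{t} \rangle \rightarrow \langle \overline{T}, \overline{t'} \rangle ,$$

where $\overline{T}$, $\overline{t}$, $\overline{t'}$ are terms in UNIVERSAL. Then, the equivalence (1) in Section 2
holds for rewriting logic (as proved in [7,15,16]) and takes the form

$$T \vdash t \rightarrow t' \quad \text{iff} \quad \mathtt{UNIVERSAL} \vdash \langle \overline{T}, \overline{t} \rangle \rightarrow \langle \overline{T}, \overline{t'} \rangle .$$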

^ A sort in a parameterized membership equational theory can be used to represent 
theoremhood in a family of logics if and only if there is a one-to-one correspon- 
dence between logics in the family and instances of the parameterized membership 
equational logic theory, and this correspondence is such that theoremhood in a logic 
in the family can be represented as membership in this sort in the corresponding 
instance of the parameterized membership equational logic. 
6.2 Reflection in Maude

Maude's language design and implementation make systematic use of the fact
that rewriting logic is reflective to give the user a well-defined gateway to
the metatheory of rewriting logic. This entry point is the predefined module
META-LEVEL, which is a (partial) specification in Maude of UNIVERSAL [8]. In the
module META-LEVEL, a Maude term $t$ is reified as an element $\overline{t}$ of a data type
Term of terms, and a Maude module $T$, i.e., a membership equational theory,
is reified as a ground term $\overline{T}$ in a data type Module of modules. See [8] for a
complete definition of the module META-LEVEL and of the metarepresentation
map for theories and terms.

The metarepresentation of a parameterized membership equational theory
$T_G[V]$ is similar to that of an unparameterized theory $T$, except that parameters
are treated in a special way. The parametric nature of $T_G[V]$ is expressed in its
metarepresentation $\overline{T_G}[\overline{V}]$ by the fact that each parameter $v \in V$ is represented
by a (meta-) variable $\overline{v}$ of sort Term.

Note that since, in Definition 3, a parameterized theory $T_G[V]$ is technically
defined as an ordinary membership equational theory (plus some extra information),
one could metarepresent $T_G[V]$ as an ordinary theory, and then one would
get a ground term of sort Module, instead of a term with variables. Therefore,
our notation is potentially confusing, since it depends on whether $T_G[V]$ is
metarepresented as a parameterized entity or as an unparameterized one. Rather
than introducing new notation, we have chosen to resolve the possible ambiguities
by the context in which they occur. In particular, having introduced $T_G[V]$
as a parameterized membership equational theory, $\overline{T_G}[\overline{V}]$ will always denote
the metarepresentation of $T_G[V]$ as a parameterized entity. The same rule applies
when metarepresenting parameterized terms in parameterized membership
equational theories.

To reason about metarepresented (parameterized) theories we have defined,
in an extension META-IND of META-LEVEL, a Boolean function (_:_in_) that reflects
at the metalevel the membership relation in membership equational logic.
(The theory is so named because it is in this theory that we will prove inductive
metatheorems.) In particular, $(\overline{t} : \overline{s}\ \mathtt{in}\ \overline{T})$ checks, at the metalevel, whether the
ground term $t$ has the sort $s$ in the functional module $T$. Specifically, this check
of membership is based on the equivalence

$$T \vdash t : s \iff \mathtt{META\text{-}IND} \vdash (\overline{t} : \overline{s}\ \mathtt{in}\ \overline{T}) = \mathtt{true} .$$

In [2] we give the full specification of (_:_in_) for a restricted subclass $\mathcal{C}$ of
modules. Members of $\mathcal{C}$ are modules that correspond, basically, to membership
equational theories whose axioms are Horn clauses that only involve membership
assertions (no equations). The Boolean function (_:_in_) will play a key role in
the rest of this paper.
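To give a feel for what such a Boolean membership check computes, here is a minimal Python sketch. It is not the META-IND specification of [2]; it assumes a further restriction of our own, namely that every variable in the premises of a clause also occurs in its head and that premises mention only subterms of the head, so that a simple recursive check terminates. Variables are strings, other terms are nested tuples, and `match`, `instantiate_term`, `has_sort` and the module `NAT_PARITY` are hypothetical names.

```python
def match(pattern, term, subst):
    """Match a clause head against a ground term; return a substitution or None."""
    if isinstance(pattern, str):                       # a variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def instantiate_term(term, subst):
    """Replace the variables of `term` by their bindings in `subst`."""
    if isinstance(term, str):
        return subst[term]
    return (term[0], *(instantiate_term(a, subst) for a in term[1:]))

def has_sort(term, sort, clauses):
    """True iff the ground term has the given sort according to the Horn clauses."""
    for premises, (head, head_sort) in clauses:
        if head_sort != sort:
            continue
        subst = match(head, term, {})
        if subst is not None and all(
                has_sort(instantiate_term(t, subst), s, clauses)
                for t, s in premises):
            return True
    return False

# A toy membership-only module: (premises, conclusion), atoms are (term, sort).
NAT_PARITY = [
    ([], (("zero",), "Nat")),
    ([("N", "Nat")], (("s", "N"), "Nat")),
    ([], (("zero",), "Even")),
    ([("N", "Even")], (("s", ("s", "N")), "Even")),
]

one = ("s", ("zero",))
two = ("s", one)
```

Note that the modus ponens clause of MINIMAL falls outside this simplified fragment, since its premise variable A does not occur in the head B; checking such clauses requires search rather than a single recursive descent.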

In what follows, $\mathcal{C}$ will always denote the above mentioned subclass of modules.
In addition, for any parameterized membership equational theory $T_G[V]$,
we write $T_G[V] \in \mathcal{C}$ iff, for any instance $T_G[\beta]$ of $T_G[V]$, $T_G[\beta] \in \mathcal{C}$. We also write
$\overline{V}$ for the set of variables of sort Term that metarepresent the parameters $v \in V$
in $T_G[V]$.



6.3 Building an Inductive Theorem Prover 

In Section 4.1 we have presented induction principles for reasoning about first- 
order formulae over sorts defined in functional modules, i.e., membership equa- 
tional theories. Here we explain how to combine the use of rewriting logic as a 
logical framework and the reflective capabilities of Maude to build a theorem 
prover for carrying out inductive proofs. The paper [10] provides further details 
on building theorem proving tools in Maude. 

To build an inductive theorem prover, we use rewriting logic to specify its 
inference system (as explained in Section 5) and reflection to define strategies 
that control rule application. Strategies are needed here since inference rules will 
be specified as rewrite rules that are not necessarily Church-Rosser or terminat- 
ing. Hence, it is important to have some way of controlling the application of 
these rewrite rules in order to drive rewriting in some desired direction. Maude 
users can control rewriting by specifying, at the metalevel, their own rewriting 
strategies. (See [7,9] for more details on defining strategies in Maude.) 

Our inductive theorem prover, the ITP tool,^ has a reflective design. The
functional module $T$, about which we want to prove inductive theorems, is at
the object level. An inference system $\mathcal{I}$ for inductive proofs uses $T$ as data
and therefore is specified as a system module ITP-RULES at the metalevel. In
particular, ITP-RULES encodes syntax and proof rules for first-order logic as well
as the induction over sort definitions introduced in Definition 2. Finally, different
proof tactics to guide the application of the rewrite rules specifying the inference
rules in $\mathcal{I}$ are strategies, which are defined at the meta-metalevel in a module
ITP-TACTICS.

Operationally, to use the ITP tool, the user submits as an initial goal the pair 
formed by (the metarepresentation) of a functional module and (the representa- 
tion of) the first-order sentence over its signature that is to be proved, and then 
this goal is successively transformed by rewriting — using the inference rules as 
rewrite rules — into different sets of subgoals, until (in the case of a successful 
proof) no subgoals are left. The application of the inference rules as rewrite rules 
is controlled by the user using strategies. 
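The operational reading above, in which a goal set is rewritten until no subgoals remain, can be sketched in a few lines. The Python fragment below is a toy model under our own assumptions and does not reproduce ITP-RULES or ITP-TACTICS: goals are atoms (strings), the constant `("true",)`, or conjunctions `("and", g1, g2)`; each inference rule maps a goal to a list of subgoals or to `None` when it does not apply; and a strategy is simply an ordered list of rules. All names are hypothetical.

```python
def split_and(goal):
    """Rule: replace a conjunction by its two conjuncts."""
    if isinstance(goal, tuple) and goal[0] == "and":
        return [goal[1], goal[2]]
    return None

def drop_true(goal):
    """Rule: discharge the trivially true goal."""
    return [] if goal == ("true",) else None

def by_assumption(assumptions):
    """Rule schema: discharge a goal that is one of the assumptions."""
    def rule(goal):
        return [] if goal in assumptions else None
    return rule

def run(goals, strategy):
    """Rewrite the goal list with the rules of `strategy`, tried in order, until stuck."""
    goals = list(goals)
    progress = True
    while goals and progress:
        progress = False
        for k, goal in enumerate(goals):
            for rule in strategy:
                new = rule(goal)
                if new is not None:
                    goals[k:k + 1] = new      # replace the goal by its subgoals
                    progress = True
                    break
            if progress:
                break
    return goals

goal = ("and", "p", ("and", ("true",), "q"))
proved = run([goal], [split_and, drop_true, by_assumption({"p", "q"})])
stuck = run([goal], [split_and, drop_true, by_assumption({"p"})])
```

Here `proved` is the empty list (a successful proof), whereas `stuck` still contains the open subgoal `q`. Keeping `run` independent of the individual rules is a small-scale analogue of the separation between logic and control discussed in this section.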

Finally, note that building the theorem prover using different levels of re- 
flection results in a modular design with a clean separation between the logical 
and the control components. For example, we can simply extend the tool by 
specifying additional inference rules in the module ITP-RULES without changing 
the strategy language defined in the module ITP-TACTICS and vice versa. 



^ http://sophia.unav.es/~clavel contains the most recent version of this tool.
7 Rewriting Logic as a Metalogical Framework

We now show how the induction principles introduced in Section 4.1 for reason- 
ing in membership equational theories can be uniformly reflected for reasoning, 
at the metalevel, about membership equational theories and their parameter- 
ized extensions. The results presented in this section provide the basis for using 
rewriting logic as a metalogical framework. 

7.1 Inductive Theorems versus Inductive Metatheorems 

The induction principle presented in Section 4.1 is well-suited for proving properties
of ground terms of sort $s$ in a given membership equational theory $(\Omega, E)$,
with $\Omega = (K, \Sigma, S)$ and $s$ a sort in some $S_k$, when these properties are
expressible as first-order formulae over the signature $\Omega$; when this is the case, a
property $P$ holds if the first-order formula that expresses $P$ holds in the initial
model of $(\Omega, E)$. Of course, there are many interesting properties satisfied by
ground terms of sort $s$ that cannot be expressed as first-order formulae over
the signature $\Omega$, despite the fact that they are typically proved by induction
over the definition of the sort $s$, e.g., properties that relate different membership
equational theories. Many of these properties can be naturally expressed, at the
metalevel, as first-order formulae over the signature of META-IND. Consider, for
example, the following property: let $T = (\Omega, E) \in \mathcal{C}$ and $T' = (\Omega', E') \in \mathcal{C}$ be
membership equational theories; then the property

    if $t$ is a ground term of sort $s$ in $T$,
    then $t$ is also a ground term of sort $s'$ in $T'$

is not expressible as a first-order formula over either $T$ or $T'$. Notice that,
using the Boolean function (_:_in_), we can express this as a first-order formula
at the metalevel, namely,

$$\forall x : \mathtt{Term}.\ (x : \overline{s}\ \mathtt{in}\ \overline{T} = \mathtt{true} \Rightarrow x : \overline{s'}\ \mathtt{in}\ \overline{T'} = \mathtt{true}) . \qquad (5)$$

The situation is similar when proving properties of ground terms of sort $s$ for
all instances $T_G[\beta]$ of a parameterized theory $T_G[V] \in \mathcal{C}$. Consider the following
generalization of the above statement. Let $T_G[V] \in \mathcal{C}$ be a parameterized extension
of $T = (\Omega, E)$, with $V = \{v_1, \ldots, v_n\}$, and $k_i$ the kind of the parameter $v_i$.
Then we might formalize that for all instances $T_G[\beta]$ of $T_G[V]$

    if $t$ is a ground term of sort $s$ in $T_G[\beta]$,
    then $t$ is a ground term of sort $s'$ in $T'$.

Again, we cannot express this property as a first-order formula over the signature
of any particular instance of $T_G[V]$. Notice, however, that, using the Boolean
function (_:_in_), we can formalize this as

$$\begin{array}{l}
\forall (\overline{V} : \mathtt{Term},\ x : \mathtt{Term}).\ ((\overline{v_1} : \overline{s_1}\ \mathtt{in}\ \overline{T} = \mathtt{true} \wedge \ldots \wedge \overline{v_n} : \overline{s_n}\ \mathtt{in}\ \overline{T} = \mathtt{true}) \\
\qquad \Rightarrow (x : \overline{s}\ \mathtt{in}\ \overline{T_G}[\overline{V}] = \mathtt{true} \Rightarrow x : \overline{s'}\ \mathtt{in}\ \overline{T'} = \mathtt{true})) , \qquad (6)
\end{array}$$

where, for each parameter $v_i \in V$, $s_i$ is a sort in $S_{k_i}$.
We claim that instances of both (5) and (6) can be proved by induction
in a way that mirrors their expected inductive proofs (we will show this for a
particular example in Section 8). To see this, the crucial observation, which allows
us to mirror inductive reasoning over a sort $s$ in a theory $T$ (or in a parameterized
extension $T_G[V]$) by inductive reasoning over META-IND, is the following. Let $s$
be a sort in $T$ defined by a set of Horn clauses $\{C_1, \ldots, C_n\}$. By the definition
of the Boolean function (_:_in_), the set of ground terms $u$ of sort Term such
that

$$u : \overline{s}\ \mathtt{in}\ \overline{T} = \mathtt{true} \qquad (7)$$

is precisely the set of terms of the form $u = \overline{t}$, for $t$ a ground term of sort $s$
in $T$, and can be defined inductively by a set of Horn clauses $\{\overline{C_1}, \ldots, \overline{C_n}\}$
that reflect, at the metalevel, the set of Horn clauses $\{C_1, \ldots, C_n\}$. The idea is
that we can then use $\{\overline{C_1}, \ldots, \overline{C_n}\}$ to derive an induction rule $(\overline{\mathrm{ind}})$ to prove
metaproperties about the ground terms of sort $s$ in $T$, in exactly the same way
as we obtained the induction rule (ind) from $\{C_1, \ldots, C_n\}$ in Definition 2. Since
each $\overline{C_i}$ mirrors at the metalevel the corresponding $C_i$, inductive metareasoning
with $(\overline{\mathrm{ind}})$ also mirrors inductive reasoning with (ind). Notice that, when dealing
with parameterized extensions, the resulting induction rule $(\overline{\mathrm{ind}})$ will have to be
universally quantified over the variables representing the parameters.

7.2 Metalevel Inference Rules for Parameterized Theories 

We now formalize the above intuitions and introduce a new inference rule for 
proving a broad class of metatheorems about parameterized membership equa- 
tional theories. These metatheorems correspond to inductive properties of the 
initial model of the module META-IND of the general form 

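$$\begin{array}{l}
\forall (\overline{V} : \mathtt{Term},\ x : \mathtt{Term}).\ ((\overline{v_1} : \overline{s_1}\ \mathtt{in}\ \overline{T} = \mathtt{true} \wedge \ldots \wedge \overline{v_m} : \overline{s_m}\ \mathtt{in}\ \overline{T} = \mathtt{true}) \\
\qquad \Rightarrow ((x : \overline{s}\ \mathtt{in}\ \overline{T_G}[\overline{V}] = \mathtt{true} \wedge \Phi) \Rightarrow \phi)) ,
\end{array}$$

where $\phi$ and $\Phi$ range over first-order formulae over the signature of the module
META-IND. The formulae (5) and (6) above are instances of this general form. In
the case of (5), the nonparameterized theory $T$ constitutes a trivial parameterized
extension $T_\emptyset[\emptyset]$ of itself.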

The soundness of the inference rule that we introduce is based on the fact
that, for any instance $T_G[\beta]$ of $T_G[V]$, the set of terms of sort Term that metarepresent
terms of sort $s$ in $T_G[\beta]$ is inductively defined. In essence, the new inference
rule reflects at the metalevel the induction principle defined in Definition 2 for
reasoning over the terms of sort $s$ in any instance of $T_G[V]$.

First, we define for any parameterized extension $T_G[V] = (\Omega[V], E \cup G) \in \mathcal{C}$,
with $\Omega[V] = (K, \Sigma \cup V, S)$, and any sort $s$ in some $S_k$, a set of clauses
$\overline{C}_{[T_G[V],s]}$ that mirrors, at the metalevel, the set of Horn clauses $C_{[T_G[V],s]}$ that
inductively define the terms of sort $s$ in $T_G[V]$. (Recall that parameters in $V$ are
metarepresented as variables of sort Term.) Then, we define an inference rule for
proving certain metatheorems about $T_G[V]$.
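Definition 4 Let $T_G[V] = (\Omega[V], E \cup G) \in \mathcal{C}$ be a parameterized extension, with
$\Omega[V] = (K, \Sigma \cup V, S)$, and let $s_0$ be a sort in some $S_k$, such that
$C_{[T_G[V],s_0]} = \{C_1, \ldots, C_n\}$ is the set of sentences of the form

$$\forall (x_1, \ldots, x_{p_i}).\ t_1 : s_1 \wedge \ldots \wedge t_{q_i} : s_{q_i} \Rightarrow t_0 : s_0$$

that specify $s_0$ in $T_G[V]$.

We define $\overline{C}_{[T_G[V],s_0]} = \{\overline{C_1}, \ldots, \overline{C_n}\}$, where, for $1 \le i \le n$, $\overline{C_i}$ is

$$\begin{array}{l}
\forall (x_1, \ldots, x_{p_i}). \\
\quad \overline{t_1} : \overline{s_1}\ \mathtt{in}\ \overline{T_G}[\overline{V}] = \mathtt{true} \wedge \ldots \wedge \overline{t_{q_i}} : \overline{s_{q_i}}\ \mathtt{in}\ \overline{T_G}[\overline{V}] = \mathtt{true} \\
\quad \Rightarrow \overline{t_0} : \overline{s_0}\ \mathtt{in}\ \overline{T_G}[\overline{V}] = \mathtt{true} ,
\end{array}$$

where $\{x_1, \ldots, x_{p_i}\}$ are variables of the kind of the sort Term, and, for $0 \le j \le q_i$,
$\overline{t_j}$ is the metarepresentation of the term $t_j$, except that any variable $x$ in $t_j$ is not
metarepresented but, instead, is replaced by a (meta-) variable $x$ of the kind
of the sort Term. Note that, in general, some clauses in $\overline{C}_{[T_G[V],s_0]}$ may contain
free variables.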



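Definition 5 Let $T_G[V] = (\Omega[V], E \cup G) \in \mathcal{C}$ be a parameterized extension, with
$\Omega[V] = (K, \Sigma \cup V, S)$ and $V = \{v_1, \ldots, v_m\}$. Let $s_0$ be a sort in some $S_k$ and
let $C_{[T_G[V],s_0]} = \{C_1, \ldots, C_n\}$ be those sentences of the form

$$\forall (x_1, \ldots, x_{p_i}).\ t_1 : s_1 \wedge \ldots \wedge t_{q_i} : s_{q_i} \Rightarrow t_0 : s_0 , \qquad (8)$$

that specify $s_0$ in $T_G[V]$. Finally, let $\tau$ be a first-order formula, with free variable
$x$ of the kind of the sort Term, of the form

$$\begin{array}{l}
\forall \overline{V} : \mathtt{Term}.\ ((\overline{v_1} : \overline{z_1}\ \mathtt{in}\ \overline{T} = \mathtt{true} \wedge \ldots \wedge \overline{v_m} : \overline{z_m}\ \mathtt{in}\ \overline{T} = \mathtt{true}) \\
\qquad \Rightarrow ((x : \overline{s_0}\ \mathtt{in}\ \overline{T_G}[\overline{V}] = \mathtt{true} \wedge \Phi) \Rightarrow \phi)) ,
\end{array}$$

where, for each parameter $v_i \in V$, $z_i \in S_{k_i}$, for $k_i$ the kind of $v_i$.

An inductive inference rule for META-IND, with respect to $x$ and $\tau(x)$, is the
formula

$$\begin{array}{l}
(\forall \overline{V} : \mathtt{Term}.\ (\overline{v_1} : \overline{z_1}\ \mathtt{in}\ \overline{T} = \mathtt{true} \wedge \ldots \wedge \overline{v_m} : \overline{z_m}\ \mathtt{in}\ \overline{T} = \mathtt{true}) \\
\qquad \Rightarrow [\overline{C_1}]_\tau \wedge \ldots \wedge [\overline{C_n}]_\tau) \qquad (9) \\
\Rightarrow \forall x : \mathtt{Term}.\ \tau(x) ,
\end{array}$$

where, for each $C_i$ of the form (8), $\overline{C_i}$ is defined as in Definition 4, and $[\overline{C_i}]_\tau$ is
the formula

$$\begin{array}{l}
\forall (x_1, \ldots, x_{p_i}). \\
\quad [\overline{t_1} : \overline{s_1}\ \mathtt{in}\ \overline{T_G}[\overline{V}] = \mathtt{true}]_\tau \wedge \ldots \wedge [\overline{t_{q_i}} : \overline{s_{q_i}}\ \mathtt{in}\ \overline{T_G}[\overline{V}] = \mathtt{true}]_\tau \\
\quad \Rightarrow [\overline{t_0} : \overline{s_0}\ \mathtt{in}\ \overline{T_G}[\overline{V}] = \mathtt{true}]_\tau
\end{array}$$

where, for $0 \le j \le q_i$,

$$[\overline{t_j} : \overline{s_j}\ \mathtt{in}\ \overline{T_G}[\overline{V}] = \mathtt{true}]_\tau = \begin{cases} \phi(\overline{t_j}) & \text{if } s_j = s_0 \\ \overline{t_j} : \overline{s_j}\ \mathtt{in}\ \overline{T_G}[\overline{V}] = \mathtt{true} & \text{otherwise.} \end{cases}$$

The soundness of the inference rule (9) is proved in [2].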
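To see Definitions 4 and 5 at work, the following Python sketch applies them to the clauses that specify the sort Theorem in the module MINIMAL_S[X] of Figure 3. It is a string-level illustration under our own assumptions: terms and formulas are kept as plain strings, the quantifier prefix of each clause is omitted, overlines (metarepresentation) are not displayed, and `phi` is instantiated with the property used for the deduction theorem in Section 8. The names `meta_atom`, `reflect`, `phi` and `induction_case` are hypothetical.

```python
MODULE = "MINIMAL_S[X]"   # stands for the metarepresented parameterized module

# Clauses specifying Theorem: (premises, conclusion); each atom is (term, sort).
CLAUSES = [
    ([], ("X", "Theorem")),
    ([("A", "Formula"), ("B", "Formula")], ("A -> (B -> A)", "Theorem")),
    ([("A", "Formula"), ("B", "Formula"), ("C", "Formula")],
     ("(A -> B) -> ((A -> (B -> C)) -> (A -> C))", "Theorem")),
    ([("A", "Formula"), ("B", "Formula"), ("A -> B", "Theorem"), ("A", "Theorem")],
     ("B", "Theorem")),
]

def meta_atom(term, sort, module=MODULE):
    """The metalevel membership atom reflecting  term : sort."""
    return f"({term}) : {sort} in {module} = true"

def reflect(clause):
    """Definition 4: the metalevel clause mirroring an object-level clause."""
    premises, conclusion = clause
    head = meta_atom(*conclusion)
    if not premises:
        return head
    body = " /\\ ".join(meta_atom(t, s) for t, s in premises)
    return f"{body} => {head}"

def phi(term):
    """The property to be proved by induction: X -> term is a theorem of MINIMAL."""
    return f"(X -> ({term})) : Theorem in MINIMAL = true"

def induction_case(clause, s0="Theorem"):
    """Definition 5: memberships in s0 become phi; other atoms are kept."""
    premises, conclusion = clause

    def translate(term, sort):
        return phi(term) if sort == s0 else meta_atom(term, sort)

    head = translate(*conclusion)
    if not premises:
        return head
    body = " /\\ ".join(translate(t, s) for t, s in premises)
    return f"{body} => {head}"

cases = [induction_case(c) for c in CLAUSES]
```

The four strings in `cases` have the same shape as the four conjuncts of the goal shown later in Figure 4: the parametric axiom, the K and S schemata, and modus ponens with its two induction hypotheses.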







7.3 Building an Inductive Metatheorem Prover

In Section 6.3 we indicated how it is possible to use reflection in Maude to
design modular, extensible theorem proving tools. In particular, we explained
the reflective design of the ITP tool and how we implemented the inference
rule (3) for induction over sort definitions. For carrying out formal metatheory,
we have also extended the ITP tool with the inductive inference rule (9). It is
this extended version of the tool that we have used in the experimental work
reported on in the next section.

8 An Example 

In this section we give an example that illustrates how rewriting logic can be 
used as a reflective metalogical framework. Our example is a standard one in 
metareasoning, namely, the deduction theorem. 

8.1 The Deduction Theorem for Minimal Logic 

We present here the deduction theorem for minimal logic of implication. This 
theorem is interesting for several reasons. To begin with, it is a central metathe- 
orem that holds for Hilbert systems for many logics and justifies proof under 
temporary assumption in the manner of a natural deduction system. Moreover, 
although relatively simple, it illustrates some subtle aspects of formal metarea- 
soning. For example, it is actually a metatheorem not about a particular deduc- 
tion system, but rather a metatheorem that relates different deduction systems: 
one in which $A \to B$ is proved, and a second (which is obtained from the first by
adding the axiom A) in which B is proved. Indeed, since A is an arbitrary for- 
mula, the standard statement of the deduction theorem is actually a statement 
about the relationship between a family of pairs of deduction systems. 

For $A$ and $B$ formulae, we write $\vdash_M A$ to denote that $A$ is a theorem in
minimal logic, and $\vdash_{M[A]} B$ to denote that if minimal logic is extended with
the additional axiom $A$, then $B$ belongs to the resulting set of theorems. The
deduction theorem then states that, for any formulae $A$ and $B$ in minimal logic,

$$\text{if } \vdash_{M[A]} B \text{ then } \vdash_M A \to B . \qquad (10)$$

This metatheorem is proven by induction on the structure of derivations in 
minimal logic extended with the axiom A. 
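The induction on derivations can be read as a procedure that transforms a derivation of B using the extra axiom into a derivation of the implication without it. The Python sketch below spells out that reading under our own assumptions: formulas are nested tuples, a proof is a list of (formula, justification) pairs, axiom lines record the instantiation of the K or S schema they use, and `("mp", i, j)` means that the line follows from line i (an implication) and line j (its antecedent). The names `check` and `deduce` are hypothetical; this is the textbook construction, not the ITP proof.

```python
def imp(a, b):
    return ("->", a, b)

def K(a, b):                      # A -> (B -> A)
    return imp(a, imp(b, a))

def S(a, b, c):                   # (A -> B) -> ((A -> (B -> C)) -> (A -> C))
    return imp(imp(a, b), imp(imp(a, imp(b, c)), imp(a, c)))

def check(proof, hyp=None):
    """True iff each line is the hypothesis, a K/S instance, or follows by modus ponens."""
    for k, (f, why) in enumerate(proof):
        if why[0] == "hyp":
            ok = hyp is not None and f == hyp
        elif why[0] == "K":
            ok = f == K(*why[1:])
        elif why[0] == "S":
            ok = f == S(*why[1:])
        else:                     # ("mp", i, j): line i is G -> f and line j is G
            _, i, j = why
            ok = i < k and j < k and proof[i][0] == imp(proof[j][0], f)
        if not ok:
            return False
    return True

def deduce(proof, x):
    """Turn a proof of B that may use hypothesis x into a hypothesis-free proof of x -> B."""
    out, idx = [], []             # idx[k]: position in `out` of the line  x -> proof[k]

    def emit(f, why):
        out.append((f, why))
        return len(out) - 1

    for f, why in proof:
        if why[0] == "hyp":       # case: B is the new axiom, so derive x -> x
            xx = imp(x, x)
            k1 = emit(K(x, xx), ("K", x, xx))
            k2 = emit(K(x, x), ("K", x, x))
            s = emit(S(x, xx, x), ("S", x, xx, x))
            m = emit(imp(K(x, xx), xx), ("mp", s, k2))
            idx.append(emit(xx, ("mp", m, k1)))
        elif why[0] in ("K", "S"):  # case: B is an axiom instance
            a = emit(f, why)
            k = emit(K(f, x), ("K", f, x))
            idx.append(emit(imp(x, f), ("mp", k, a)))
        else:                     # case: B follows by modus ponens from lines i and j
            _, i, j = why
            g = proof[j][0]
            s = emit(S(x, g, f), ("S", x, g, f))
            m = emit(imp(imp(x, imp(g, f)), imp(x, f)), ("mp", s, idx[j]))
            idx.append(emit(imp(x, f), ("mp", m, idx[i])))
    return out

# A proof of q -> p from the extra axiom p.
proof = [
    ("p", ("hyp",)),
    (K("p", "q"), ("K", "p", "q")),
    (imp("q", "p"), ("mp", 1, 0)),
]
closed = deduce(proof, "p")       # a proof of p -> (q -> p) in minimal logic alone
```

Each branch of `deduce` corresponds to one case of the induction: the new axiom itself, an axiom instance, and modus ponens, where the two recursive results play the role of the induction hypotheses.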



Formalization. Consider now the representation of minimal logic in rewriting
logic provided by MINIMAL in Figure 2, and its parameterized extension
MINIMAL_S[X] introduced in Figure 3. Recall that MINIMAL represents minimal
logic in the sense that a formula A is a theorem in minimal logic if and only if
A is a term of sort Theorem. We can rephrase the deduction theorem as follows:
for any formulae A and B, if B is a term of sort Theorem in MINIMAL_S[A], then
$A \to B$ is a term of sort Theorem in MINIMAL.
Notice that this theorem states an implication between the truth of two
membership assertions over two different membership equational theories (in
fact, a whole family of such pairs, since A is a parameter). Hence, to formalize
the deduction theorem, we must move up a level, to META-IND. We claim that
the following formula formalizes the deduction theorem as a metatheorem about
the initial model of META-IND:

$$\begin{array}{l}
\forall (X, B) : \mathtt{Term}. \\
\quad ((X : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true} \wedge B : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true}) \\
\quad \Rightarrow (B : \overline{\mathtt{Theorem}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[X] = \mathtt{true} \\
\qquad \Rightarrow \overline{X \to B} : \overline{\mathtt{Theorem}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true})) , \qquad (11)
\end{array}$$

where in the term denoted by $\overline{X \to B}$ the (meta-) variables X and B of sort Term
are not metarepresented as if they were object level variables, but are instead
preserved as (meta-) variables. From now on, we will follow the same convention
for terms of this kind, i.e., terms that include elements of sort Term in META-IND.

When performing metareasoning we must reason about terms being well-sorted
with respect to particular theories, in this case, the membership equational
theory MINIMAL. For this reason, we have explicitly assumed in our formalization
of the deduction theorem the well-typedness of X and B. Of course,
standard textbook proofs also require this, but such well-formedness details are
usually glossed over as trivial. The correctness of the formalization then follows
from the definition of the Boolean function (_:_in_) and the fact that MINIMAL
is a conservative representation of minimal logic.

Note, incidentally, that the requirement that B is a formula in minimal logic
is actually superfluous and can be dropped (we will do so for the proof below).
It is provable (again by induction) that any theorem in any extension of minimal
logic with a new axiom is also a formula in minimal logic.

Proof of the Deduction Theorem. We show here how we prove (11). Note
that our proof mirrors the standard proof of the deduction theorem.

To prove (11) in META-IND we apply the reflected version of the induction
principle for the sort Theorem in the parameterized extension MINIMAL_S[X], that
is, the corresponding instance of the inference rule (9). This reduces proving (11)
to proving the formula given in Figure 4. Notice that the four conjuncts correspond
to the cases involved in proving the deduction theorem by induction over
the proof that B is a theorem in minimal logic extended with the axiom A. The
first formalizes the case when B is X. The next two conjuncts formalize the cases
where B is either an instance of the K or S axiom schemata. The final conjunct
formalizes the case of B being proved by an instance of modus ponens.

By the theorem of constants for membership equational logic [29], we can
reduce proving this formula to proving the four conjuncts that result from replacing
the variable X by a new constant symbol x of sort Term, under the
assumption that

$$x : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true} . \qquad (12)$$
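$$\begin{array}{l}
\forall X : \mathtt{Term}.\ [\, X : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true} \Rightarrow \\
[\, \overline{X \to X} : \overline{\mathtt{Theorem}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true} \\
\wedge \\
(\forall (A, B).\ (A : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[X] = \mathtt{true} \wedge \\
\qquad B : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[X] = \mathtt{true}) \\
\quad \Rightarrow \overline{X \to (A \to (B \to A))} : \overline{\mathtt{Theorem}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true}) \\
\wedge \\
(\forall (A, B, C).\ (A : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[X] = \mathtt{true} \wedge \\
\qquad B : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[X] = \mathtt{true} \wedge \\
\qquad C : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[X] = \mathtt{true}) \\
\quad \Rightarrow \overline{X \to ((A \to B) \to ((A \to (B \to C)) \to (A \to C)))} : \overline{\mathtt{Theorem}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true}) \\
\wedge \\
(\forall (A, B).\ (A : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[X] = \mathtt{true} \wedge \\
\qquad B : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[X] = \mathtt{true} \wedge \\
\qquad \overline{X \to (A \to B)} : \overline{\mathtt{Theorem}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true} \wedge \\
\qquad \overline{X \to A} : \overline{\mathtt{Theorem}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true}) \\
\quad \Rightarrow \overline{X \to B} : \overline{\mathtt{Theorem}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true}) \,]\,]
\end{array}$$

Fig. 4. Goal resulting after induction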



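The proof of each of the resulting conjuncts mirrors the proof of the corresponding
case in the standard inductive proof of the deduction theorem. In
what follows, $\overline{\mathtt{MINIMAL}_S}[x]$ denotes the term of sort Module that results from
$\overline{\mathtt{MINIMAL}_S}[X]$ by replacing the free variable X of sort Term by the new constant
symbol x. Consider, for example, how we prove the third conjunct:

$$\begin{array}{l}
\forall (A, B, C). \qquad (13) \\
\quad (A : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[x] = \mathtt{true} \wedge \\
\quad\ B : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[x] = \mathtt{true} \wedge \\
\quad\ C : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[x] = \mathtt{true} \\
\quad \Rightarrow \overline{x \to ((A \to B) \to ((A \to (B \to C)) \to (A \to C)))} : \overline{\mathtt{Theorem}} \\
\qquad \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true}) .
\end{array}$$

Using the theorem of constants again, we can reduce proving (13) to proving

$$\overline{x \to ((a \to b) \to ((a \to (b \to c)) \to (a \to c)))} : \overline{\mathtt{Theorem}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true} , \qquad (14)$$

under the assumptions that a, b, c are new constants of sort Term such that

$$\begin{array}{l}
a : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[x] = \mathtt{true} \wedge \\
b : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[x] = \mathtt{true} \wedge \qquad (15) \\
c : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}_S}[x] = \mathtt{true} .
\end{array}$$

Note that, from the assumptions (12) and (15), by using the fact (which must
be proven separately) that any theorem in any extension of minimal logic with a
new axiom is a well-formed formula in minimal logic, we can derive the formula

$$\begin{array}{l}
a : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true} \wedge \\
b : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true} \wedge \\
c : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true} \wedge \qquad (16) \\
x : \overline{\mathtt{Formula}}\ \mathtt{in}\ \overline{\mathtt{MINIMAL}} = \mathtt{true} .
\end{array}$$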

Finally, we prove (14) using the equations in META-IND and (16). This proof
mirrors the proof that, for any formulae X, A, B, and C,

$$X \to ((A \to B) \to ((A \to (B \to C)) \to (A \to C))) \qquad (17)$$

is a theorem in minimal logic; in particular, this proof mirrors proving (17) by
modus ponens, using the following instance of the S axiom

$$(A \to B) \to ((A \to (B \to C)) \to (A \to C))$$

and the following instance of the K axiom

$$\begin{array}{l}
[(A \to B) \to ((A \to (B \to C)) \to (A \to C))] \\
\quad \to (X \to [(A \to B) \to ((A \to (B \to C)) \to (A \to C))]) .
\end{array}$$
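The object-level derivations mirrored by the first and the last conjunct of Figure 4 are equally short. As a worked sketch (our own reconstruction from the K and S schemata as they appear in Figure 3, with step numbers added for reference, not a transcript of the ITP proof), the case where B is X requires deriving $X \to X$, and the modus ponens case combines the two induction hypotheses $X \to (A \to B)$ and $X \to A$:

```latex
% Case "B is X": deriving X -> X in minimal logic.
\begin{align*}
&1.\ X \to ((X \to X) \to X)
  && \text{K with } A := X,\ B := X \to X \\
&2.\ X \to (X \to X)
  && \text{K with } A := X,\ B := X \\
&3.\ (X \to (X \to X)) \to ((X \to ((X \to X) \to X)) \to (X \to X))
  && \text{S with } A := X,\ B := X \to X,\ C := X \\
&4.\ (X \to ((X \to X) \to X)) \to (X \to X)
  && \text{modus ponens, 3 and 2} \\
&5.\ X \to X
  && \text{modus ponens, 4 and 1}
\end{align*}

% Case "modus ponens": from X -> (A -> B) and X -> A derive X -> B.
\begin{align*}
&1.\ (X \to A) \to ((X \to (A \to B)) \to (X \to B))
  && \text{S with } A := X,\ B := A,\ C := B \\
&2.\ (X \to (A \to B)) \to (X \to B)
  && \text{modus ponens, 1 and hypothesis } X \to A \\
&3.\ X \to B
  && \text{modus ponens, 2 and hypothesis } X \to (A \to B)
\end{align*}
```

In both derivations the S schema is used in the argument order in which it is stated in this paper, which is why the hypothesis $X \to A$ is discharged before $X \to (A \to B)$.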



8.2 Other Examples and Experience 

We have used rewriting logic as a reflective metalogical framework to carry out 
a number of other proofs in formal metatheory based on more sophisticated 
versions of the deduction theorem for minimal logic. In particular, we have 
proved results similar to those of Basin and Matthews [4,5], who have shown 
how metatheorems that are parameterized by their scope of application can be 
proved using a theory of parameterized inductive definitions as a metatheory. For 
example, they present a generalized version of the deduction theorem that can 
be applied to all extensions of the language and axioms of minimal logic. From 
their theorem it follows that the deduction theorem holds for the minimal logic 
of implication and for any propositional extension of it, but not necessarily for 
extensions to modal logics (which would require adding new rules, as opposed to 
new axioms). Although rewriting logic is based on a rather different foundation 
than those considered in [5], our representation of the object logic is quite simi- 
lar and — abstracting away from the details involved in moving between levels of 
representation — the basic structure of the proofs is also similar. 
One promising area in which to apply our results is program transformation and
metaprogramming. From a reflective declarative point of view, programs that 
transform other programs are first-order functions acting on terms that metarep- 
resent theories, and the properties that they satisfy are metatheorems, as they 
are understood in this paper. This reflective declarative methodology has been 
used in [12] to specify polytypic programs like map and cata in Maude. Ac- 
cordingly, polytypic programs are specified as metalevel functions that add to a 
module the equations defining the desired object function by structural induction 
over the sort definitions. Properties of polytypic programs, like the functoriality 
of map, are then metatheorems that can be proved, as shown in [12], using
the corresponding induction rule (Definition 5). 

Here we would also like to comment on our experience in proving these the- 
orems and on the issue of managing proofs that combine reasoning at differ- 
ent levels. To the working logician or computer scientist, reflective metalogical 
frameworks may seem complicated and not particularly user-friendly since there 
is quite a bit of encoding involved in stating a metatheorem and in carrying out 
its proof. In particular, reasoning can involve three or more levels (object, meta, 
meta-meta, ...).® 

In our case, we have been able to avoid many of the practical problems of 
working with a reflective hierarchy by exploiting the reflective capabilities of 
Maude to build tools and suitable interfaces that hide levels of reflection. As 
part of our work, we have built an interface — fully specified in Maude — to in- 
teract with the ITP inductive theorem prover described in Sections 6.3 and 7.3. 
As already explained, ITP automatically extracts from a theory the induction 
principles for reasoning over its sorts (Definition 2), and (in its metaprover exten- 
sion) the induction rules that correspond to reflecting those induction principles 
at the metalevel when the task at hand is to prove a metatheorem (Definition 5) . 
Proving an inductive theorem then amounts to computing a strategy at the meta- 
metalevel, or at the meta- meta-metalevel if the theorem is, as in the case of (11), 
a metatheorem about the initial model of META-IND. Fortunately, the interface 
we use hides all these levels of encoding from the user. Hence the user can ac- 
tually abstract away many of the metarepresentation details and focus on the 
essential structure of proofs of theorems. 

® Although note that reasoning about a logic encoded as an inductive definition in
a logical framework like Isabelle also involves multiple levels, e.g., the framework’s
metalogic, the theory of inductive definitions, and the object logic. Moreover, there
is often an additional language for writing tactics.

9 Conclusion

We have presented, both abstractly and concretely, a new approach to metatheoretic
reasoning based on using reflective logical frameworks whose theories have
initial models. Initial experiments demonstrate that the machinery for reflective
deduction in membership equational logic provides a rich foundation for formalizing
and proving metatheorems. Our experiments show, for example, that one
can prove metatheorems similar to those provable in logical frameworks based on
parameterized inductive definitions, and that one has considerable flexibility in 
moving between theories and proving theorems that relate theories or establish 
properties of parameterized classes of theories. In essence, we can do this because 
the requirements that such metatheorems pose on the metatheory — namely, that 
one can build families of sets using parameterized inductive definitions and that 
one can reason about their elements by induction — are realizable in membership 
equational logic using reflection. 

There are a number of directions for further work. One concerns generaliz- 
ing our notion of a parameterized theory. Currently we can reason at the met- 
alevel about families of theories that are parameterized by sets of new constants 
and new axioms, which may make use of the new constants. For proving other 
metatheorems it would be useful to develop a more general theory representation 
calculus where one could reason at the metalevel about families of theories that 
are parameterized by arbitrary sets of new sorts, operators, and axioms. In par- 
ticular, this would allow us to prove metatheorems involving the more general 
parameterized modules of Full Maude [9,18]. 

Also, our example illustrates how it is possible to carry out proofs similar to 
those possible in stronger framework logics. However, it would be interesting to 
have a more formal comparison of the relative strengths of membership equa- 
tional logic with reflection versus stronger metalogics like higher-order logic or 
set theory. Finally, related to this is the question of how easy it is to reflect induc- 
tion principles other than structural induction, e.g., induction over an arbitrary, 
user-definable well-founded order. 
Wireless Communication and Frequencies

Wireless communication technology is the basis of radio and television broadcasting; it
is used in satellite-based cellular telephone systems, in point-to-multipoint radio access
systems, and in terrestrial mobile cellular networks, to mention a few such systems (see 
[5] for more detailed information). 

Wireless communication networks employ radio frequencies to establish communica- 
tion links. The available radio spectrum is very limited. To meet today’s radio commu- 
nication demand, this resource has to be administered and reused carefully in order to 
control mutual interference. The reuse can be organized via separation in space, time, or 
frequency, for example. The problem, therefore, arises to distribute frequencies to links 
in a “reasonable manner”. This is the basic form of the frequency assignment problem. 
What “reasonable” means, how to quantify this measure of quality, and which technical side
constraints to consider cannot be answered in general. The exact specification of this
task and its mathematical model depend heavily on the particular application consid- 
ered. 

Mobile Cellular Networks, GSM 

I will concentrate here on terrestrial mobile cellular networks, an application that has 
revolutionized the telephone business in recent years and is going to have further
significant impact in the years to come. Even in this special application the frequency
assignment problem has no universal mathematical model. I will focus on the GSM
standard (GSM stands for “Global System for Mobile Communications”), which has been
in use since 1992. GSM is the basis of almost all cellular phone networks in Europe.
It is employed in more than 100 countries serving several hundred million customers. 
The new worldwide standard UMTS (Universal Mobile Telecommunication System) 
is expected to become commercially available around 2002. It is frequently covered in 
the public press at present because of the enormous amounts of money telephone com- 
panies are paying in the national frequency auctions. UMTS handles frequency reuse 
in an even more intricate manner than GSM: frequency or time division are used in
combination with code division multiple access (CDMA) technology. 

Channel Spectrum 

The typical situation in GSM frequency planning is as follows. A telephone company 
(let us call it the operator) has bought the right to use a certain spectrum of frequencies 
$[f_{\min}, f_{\max}]$ in a particular geographical region, e.g., a country. The frequency band is,
depending on the technology utilized, partitioned into a set of channels, all with the
same bandwidth $\Delta$. The available channels are usually denoted by $1, 2, \ldots, N$, where
$N = (f_{\max} - f_{\min})/\Delta$. In Germany, for instance, an operator of a mobile phone
network owns about 100 channels. On each channel available, one can communicate
information from a transmitter to a receiver. For bidirectional traffic a second channel
is needed. In fact, if an operator buys a spectrum $[f_{\min}, f_{\max}]$, he automatically obtains
a paired spectrum of equal width for bidirectional communication. One of these spectra
is used for mobile to base station (up-link), the other for base station to mobile (down- 
link) communication. 



BTSs, TRXs, and Cells 

To serve his customers an operator has to solve a number of nontrivial problems. In an
initial step the geographical distribution of the communication demand for the planning
period is estimated. Based on these figures, a communication infrastructure has
to be installed that is capable of serving the anticipated demand. The devices handling the radio
communication with the mobile phones of the customers are called Base Transceiver 
Stations (BTS). They have radio transmission and reception equipment, including an- 
tennas and all necessary signal processing capabilities. An antenna of a BTS can be 
omni-directional or sectorized. The typical BTS used today operates three antennas 
each with an opening angle of 120 degrees. Each such antenna defines a cell. These 
cells are the basic planning units (and that is why mobile phone systems are also called 
cellular phone systems). 

The capacity of a cell is defined by the number of transmitter/receiver units, called 
TRXs, installed for this antenna. The first TRX handles the signalling and offers capac- 
ity for up to six calls (by time division). Additional TRXs can typically handle 7 or 8 
further calls - depending on the extra signalling load. No more than 12 TRXs can be 
installed for one antenna, i.e., the maximum capacity of a cell is in the range of 80 calls. 
That is why areas of heavy traffic (e.g., airports, business centers of big cities) have to 
be subdivided into many cells. 
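
The capacity figures above can be turned into a small calculation. The following Python sketch is only an illustration of the arithmetic; the function name cell_capacity_range is hypothetical, and the per-TRX figures (6 calls for the first TRX, 7 or 8 for each further one, at most 12 TRXs) are taken from the text.

```python
from typing import Tuple


def cell_capacity_range(num_trx: int) -> Tuple[int, int]:
    """Return the (low, high) number of simultaneous calls for one cell.

    Assumptions taken from the text: the first TRX carries up to 6 calls,
    every additional TRX carries 7 or 8 calls, and at most 12 TRXs can be
    installed for one antenna.
    """
    if not 1 <= num_trx <= 12:
        raise ValueError("a cell carries between 1 and 12 TRXs")
    first_trx_calls = 6
    additional = num_trx - 1
    return first_trx_calls + 7 * additional, first_trx_calls + 8 * additional


if __name__ == "__main__":
    for n in (1, 2, 12):
        print(n, cell_capacity_range(n))
```

For a fully equipped cell with 12 TRXs this gives between 83 and 94 calls, which matches the order of magnitude quoted above.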



BSCs, MSCs, and the Core Network 

In a next planning step, the operator has to locate and install the so called Base Station 
Controllers (BSCs). Each BTS has to be connected (in general via cable) to such a 
BSC, while a BSC operates several BTSs in parallel. A BSC is, e.g., in charge of the 
management of hand-overs. 

Every BSC, in turn, is connected to a Mobile Service Switching Center (MSC). The 
MSCs are connected to each other through the so called core network, which has to 
carry the “backbone traffic”. The location planning for BSCs and MSCs, the design of 
the topology of the core network, the optimization of the link capacities, routing, fail- 
ure handling, etc., constitute major tasks an operator has to address. We do not intend 
to discuss here the roles of all the devices that make up a mobile phone network and 
their mutual interplay in detail. This brief sketch is just meant to indicate that
telecommunication network planning is quite a complex task.



Channel Assignment, Hand-Over 



We have seen that the TRXs are the devices that handle radio communication with the 
mobile phones of the customers. The operators in Germany maintain networks of about 
5,000 to 15,000 TRXs and have around 100 channels available. Thus, the question arises 
how to best distribute the channels to the TRXs. 

An operational mobile phone emits signals that allow the network to roughly keep track 
of where the mobile phone is currently located. This is done via so called control chan- 
nels. Whenever a communication demand arises, the system decides which TRX is 
going to handle the communication. This decision is based on the signal strengths of 
the various TRXs that are able to communicate with the phone as well as on the current 
traffic. The mobile phone is tuned to the channel of the TRX that presently appears to 
serve the phone best. If the phone moves (e.g., in a car) the communication with its cur- 
rent TRX may become poor. The system monitors the reception quality and may decide 
to use a TRX from another cell. Such a switch is called hand-over. 

This short discussion shows that a mobile phone is typically in more than one cell. In fact,
some cells must overlap; otherwise hand-overs are not possible.



Interference 

Whenever two cells overlap and use the same channel, interference (signal-to-noise 
ratio at the receiving end of a connection) occurs in the area of cell intersection. More- 
over, antennas may cause interference far beyond their cell limits. The computation of 
the level of interference is a difficult task. It depends not only on the channels, the sig- 
nals’ strength and direction, but also on the shape of the environment, which strongly 
influences wave propagation. There are a number of theoretical methods and formulas 
with which interference can be quantified. Most mobile phone companies base their 
analysis of interference on some mathematical model taking transmitter power, dis- 
tances, fading and filtering factors into account. The data for these models typically 
come from terrain and building data bases but may also include vegetation data. They 
combine this with practical experience and extensive measurements. The result is an 
interference prediction model with which the so called co-channel interference that oc- 
curs when two TRXs transmit on the same channel is quantified. There may also be 
adjacent-channel interference when two TRXs operate on channels that are adjacent 
(i.e., one TRX operates on channel $i$, the other on channel $i + 1$ or $i - 1$).

Reality is a bit more complicated than sketched before. Several TRXs (and not only 
two) operating on the same or adjacent channels may interfere with each other at the 
same time. And what really is the interference between two cells? It may be that two 
cells interfere only in 10% of their area but with high noise or that they interfere in 
50% of their area with low noise. What if interference is high but almost no traffic is 
expected? How can a single “interference value” reflect such a difference in the
interference behaviour? There is no clear answer.

The planners have to investigate such cases in detail and have to come up with a rea- 
sonable compromise. The result, in general, is a number, the interference value, which 
is usually normalized to be between 0 and 1. This number should - to the best of the 
knowledge of the planners - characterize the interference between two TRXs (in terms 
of the model, the technological assumptions, etc., used by the operator). 



Separation and Blocked Channels 

There are also hard constraints. If two or more TRXs are installed at the same location 
(or site), there are restrictions on how close their channels may be. For instance, if a 
TRX operates on channel i, a TRX at the same site is not allowed to operate on channels
i + … . Such a restriction is called co-site separation. Separation requirements may
even be tighter if two TRXs are not only co-site, but also serve the same cell. Separation
requirements may apply also to TRXs that are in close proximity. 

The situation is even more complex. Due to government regulations, agreements with 
operators in neighbouring regions, requirements from military forces, etc., an operator 
may not be allowed to use its whole spectrum of channels at every location. This means 
that, for each TRX, there may be a set of so called blocked channels. 



Interference Graph 

A feasible assignment of channels to TRXs clearly has to satisfy all separation con- 
straints. Blocked channels must not be used. What should one do about interference? 
On our way to an adequate mathematical representation of all technical constraints let 
us first introduce the interference graph $G = (V, E)$. $G$ has a node for every TRX; two
nodes are joined by an edge if interference occurs when the associated TRXs operate
on the same channel or on adjacent channels, or if a separation constraint applies to
the two TRXs. With each edge $vw \in E$, two interference values, denoted by $c^{co}(vw)$
and $c^{ad}(vw)$, are associated; the number $c^{co}(vw)$ is the co-channel interference that
occurs when TRXs $v$ and $w$ operate on the same channel, while $c^{ad}(vw)$ denotes the
interference value arising when $v$ and $w$ operate on adjacent channels. In general,
$c^{co}(vw) > c^{ad}(vw)$. If a separation constraint applies to $v$ and $w$, then a suitably large
number is allocated to $c^{co}(vw)$ and $c^{ad}(vw)$.



Two “Natural” Approaches 

A first attempt to solve the frequency assignment problem is obvious. We try to find a
spectrum, i.e., a number of channels 1, . . . ,N such that the N available channels can 
be assigned to TRXs so that no interference occurs. 

No interference is, of course, a good aim, but this task is unrealistic for several reasons. 
A mobile phone network is a “living system”, i.e., new BTSs, antennas, etc., are in- 
stalled regularly, old ones are replaced by new ones with different characteristics. It is 
impossible to change the spectrum each time the network changes. Moreover, the number
of channels may be fixed or channels may only be available in bundles (i.e., one may
buy 75, 100 or 125, but nothing else). Frequencies are expensive and cost reasons may 
require that some interference is tolerated. In fact, some interference may be unavoid- 
able. We have data of mobile phone systems where the largest clique in the interference 
graph is about twice as large as the number of available channels and where the largest 
degree of a node is ten times as large as the number of available channels. 

Another classic choice for the solution of the frequency assignment problem is the 
following. We choose a threshold value $t$ and consider the graph $G_t = (V, E_t)$ where
$E_t := \{ij \in E \mid c^{co}(ij) \ge t\}$. Now we try to find the coloring number of $G_t$, or try to
color $G_t$ with $N$ colors. In other words, we consider interference below $t$ tolerable and
try to find an assignment of channels to TRXs such that as few channels as possible are
used (a color represents a channel; no two nodes with the same color are allowed
to be adjacent), or we try to use the available channels so that no “high interference”
occurs. Of course, if for a given threshold t no feasible coloring can be found, one has 
to modify t and try again. 

This approach is unable to handle separation constraints and ignores adjacent channel 
interference. It was the “standard approach” in the early days of the mobile phone era 
but did not prove efficient in the more complex environment we have today. 
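
The threshold approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the planning software of any operator: edges whose co-channel interference is at least the threshold t are kept, and the resulting graph is colored greedily with channels 1, ..., N. The function names threshold_graph and greedy_coloring and the sample data are hypothetical. A greedy coloring is only a heuristic, so a failure does not prove that no coloring with N channels exists.

```python
from typing import Dict, Hashable, Optional, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def threshold_graph(co: Dict[Edge, float], t: float) -> Dict[Hashable, Set[Hashable]]:
    """Adjacency sets of the thresholded graph: keep edge vw iff co[vw] >= t."""
    adj: Dict[Hashable, Set[Hashable]] = {}
    for (v, w), value in co.items():
        adj.setdefault(v, set())
        adj.setdefault(w, set())
        if value >= t:
            adj[v].add(w)
            adj[w].add(v)
    return adj


def greedy_coloring(adj: Dict[Hashable, Set[Hashable]],
                    num_channels: int) -> Optional[Dict[Hashable, int]]:
    """Assign channels 1..num_channels greedily, highest degree first.

    Returns None if the greedy pass runs out of channels at some node.
    """
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    channel: Dict[Hashable, int] = {}
    for v in order:
        used = {channel[w] for w in adj[v] if w in channel}
        free = [c for c in range(1, num_channels + 1) if c not in used]
        if not free:
            return None
        channel[v] = free[0]
    return channel


# Hypothetical co-channel interference values for three TRXs.
co_interference = {("a", "b"): 0.9, ("b", "c"): 0.2, ("a", "c"): 0.7}

if __name__ == "__main__":
    print(greedy_coloring(threshold_graph(co_interference, 0.5), 2))
    print(greedy_coloring(threshold_graph(co_interference, 0.1), 2))
```

With t = 0.5 the weak edge bc is dropped and two channels suffice; with t = 0.1 all three TRXs conflict pairwise and two channels are not enough, so one would raise t and try again, as described above.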



Minimizing Interference 

There are several other ways of modelling the frequency assignment problem mathe- 
matically, see [5]. For reasons of brevity I will focus on an approach that was employed 
in a joint project of the Konrad-Zuse-Zentrum and E-Plus, one of the four operators 
in Germany, and which has resulted in very satisfactory channel assignments. 

Let $G = (V, E)$ be the interference graph introduced before. Let $C = \{1, \ldots, N\}$
be the set of available channels, and let, for each TRX $v \in V$, $B_v$ denote the subset
of channels blocked at node $v$. The values $c^{co}(vw)$, $c^{ad}(vw)$ denote, for each edge
$vw \in E$, the co-channel and adjacent-channel interference arising when TRXs $v$ and
$w$ operate on the same or on adjacent channels. Moreover, let $d(vw) \in \mathbb{Z}_+$ denote the
separation necessary between the channels assigned to TRXs $v$ and $w$. Thus, the input
to a frequency assignment problem is a 7-tuple $(V, E, C, \{B_v\}_{v \in V}, d, c^{co}, c^{ad})$, briefly
called a network here. A frequency assignment for the network is a function $y : V \to C$.
It is called feasible if it satisfies the following side constraints:

$y(v) \in C \setminus B_v$ for all $v \in V$,

$|y(v) - y(w)| \ge d(vw)$ for all $vw \in E$.

The objective is to minimize the sum of co- and adjacent-channel interference, more
formally:

(FAP)    $\min \sum_{vw \in E:\ y(v) = y(w)} c^{co}(vw) + \sum_{vw \in E:\ |y(v) - y(w)| = 1} c^{ad}(vw)$    subject to $y$ feasible.
This version of the frequency assignment problem is a generalization of list colorings
in graph theory and it is related to the well known T-coloring problem. 
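
To make the model concrete, the following Python sketch encodes a network, checks the two side constraints, and evaluates the objective. It is a minimal illustration under stated assumptions: the class and function names (EdgeData, Network, is_feasible, interference, brute_force) and the three-TRX instance are hypothetical, and the exhaustive search is usable only for toy instances, since realistic instances cannot be solved to optimality this way.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class EdgeData:
    d: int       # required separation d(vw); 0 means no separation constraint
    co: float    # co-channel interference value
    ad: float    # adjacent-channel interference value


@dataclass
class Network:
    trxs: Set[str]                             # V
    channels: Set[int]                         # C
    blocked: Dict[str, Set[int]]               # B_v
    edges: Dict[Tuple[str, str], EdgeData]     # E with d, c^co, c^ad


def is_feasible(net: Network, y: Dict[str, int]) -> bool:
    """Check that y assigns an allowed channel to every TRX and respects d."""
    if set(y) != net.trxs:
        return False
    for v, ch in y.items():
        if ch not in net.channels or ch in net.blocked.get(v, set()):
            return False
    return all(abs(y[v] - y[w]) >= e.d for (v, w), e in net.edges.items())


def interference(net: Network, y: Dict[str, int]) -> float:
    """Sum of co-channel and adjacent-channel interference of assignment y."""
    total = 0.0
    for (v, w), e in net.edges.items():
        gap = abs(y[v] - y[w])
        if gap == 0:
            total += e.co
        elif gap == 1:
            total += e.ad
    return total


def brute_force(net: Network) -> Optional[Dict[str, int]]:
    """Enumerate all assignments; exponential, for toy instances only."""
    order = sorted(net.trxs)
    best, best_cost = None, float("inf")
    for combo in product(sorted(net.channels), repeat=len(order)):
        y = dict(zip(order, combo))
        if is_feasible(net, y):
            cost = interference(net, y)
            if cost < best_cost:
                best, best_cost = y, cost
    return best


# Hypothetical instance: a and b are co-site (separation 2), channel 3 is blocked at a.
net = Network(
    trxs={"a", "b", "c"},
    channels={1, 2, 3},
    blocked={"a": {3}},
    edges={
        ("a", "b"): EdgeData(d=2, co=1.0, ad=1.0),
        ("a", "c"): EdgeData(d=0, co=0.5, ad=0.1),
        ("b", "c"): EdgeData(d=0, co=0.3, ad=0.05),
    },
)

if __name__ == "__main__":
    best = brute_force(net)
    print(best, interference(net, best))
```

In this toy instance the separation constraint forces a and b onto channels 1 and 3, and the remaining choice for c trades co-channel against adjacent-channel interference.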

There are several ways to reformulate (FAP) in terms of other standard models of combinatorial
optimization, e.g., there is a stable set model and a so-called orientation model
which is related to linear ordering.

Several modifications of FAP have to be considered in practice. No operator wants
to change all channel assignments whenever a new plan has to be computed. Some
assignments have to stay fixed (that is easy to achieve), sometimes one looks for the
smallest number of channel adjustments within a certain range of interference, or one 
requires that, e.g., at most 100 of the assignments are changed. 

FAP is difficult in terms of complexity theory. Deciding whether a TRX network allows
a feasible assignment is NP-complete; the optimization problem is strongly NP-hard.
FAP is also difficult in practice. Nobody can solve realistic instances to optimality. 
Satisfactory lower bounds on the objective function value are very hard to obtain. All 
approaches based on polyhedral combinatorics and linear programming have failed so 
far. There is some hope to exploit semidefinite relaxations of FAP. 

The whole available “zoo” of heuristics has been tried for the solution of FAPs. Consid- 
erable improvements over previous approaches can be achieved. There are some spec- 
tacular successes, but at present, the gap between lower and upper bounds - computable 
in practice - is still very large. 

In my talk on this subject I will elaborate on the mathematical modelling of the FAP,
on the development of heuristics and on the approaches with which lower bounds have 
been computed. I will present examples from practice that show what can be achieved 
today and how this mathematical approach compares to more traditional planning tech- 
niques. 

This lecture is based on joint work of the telecommunications group at the Konrad-Zuse-Zentrum,
in particular on the work of Andreas Eisenblätter [3]. Further references
are [1], [2], [4]. The FAP website [5] is another excellent source of information. 
Abstract. The ease with which one can copy and transform data on
the Web has made it increasingly difficult to determine the origins of a
piece of data. We use the term data provenance to refer to the process 
of tracing and recording the origins of data and its movement between 
databases. Provenance is now an acute issue in scientific databases where 
it is central to the validation of data. In this paper we discuss some of 
the technical issues that have emerged in an initial exploration of the 
topic. 



1 Introduction 

When you find some data on the Web, do you have any information about how 
it got there? It is quite possible that it was copied from somewhere else on the 
Web, which, in turn may have also been copied; and in this process it may well 
have been transformed and edited. Of course, when we are looking for a best 
buy, a news story, or a movie rating, we know that what we are getting may be 
inaccurate, and we have learned not to put too much faith in what we extract 
from the Web. However, if you are a scientist, or any kind of scholar, you would 
like to have confidence in the accuracy and timeliness of the data that you are 
working with. In particular, you would like to know how it got there. 

In its brief existence, the Web has completely changed the way in which data 
is circulated. We have moved very rapidly from a world of paper documents 
to a world of on-line documents and databases. In particular, this is having a 
profound effect on how scientific research is conducted. Let us list some aspects 
of this transformation: 

— A paper document is essentially unmodifiable. To “change” it one issues a 
new edition, and this is a costly and slow process. On-line documents, by 
contrast, can be (and often are) frequently updated. 

— On-line documents are often databases, which means that they have explicit 
structure. The development of XML has blurred the distinction between 
documents and databases. 

— On-line documents/databases typically contain data extracted from other 
documents/databases through the use of query languages or “screen-scrapers”.

Among the sciences, the field of Molecular Biology is possibly one of the 
most sophisticated consumers of modern database technology and has generated 
a wealth of new database issues [15]. A substantial fraction of research in genetics
is conducted in “dry” laboratories using in silico experiments, that is, analysis
of data in the available databases. Figure 1 shows how data flows through a
very small fraction of the available molecular biology databases^. In all but one
case, there is a Lit (for literature) input to a database, indicating that this
database is curated. The database is not simply obtained by a database query
or by on-line submission, but involves human intervention in the form of addi- 
tional classification, annotation and error correction. An interesting property of 
this flow diagram is that there is a cycle in it. This does not mean that there is a
perpetual loop of possibly inaccurate data flowing through the system (though
this might happen); it means that the two databases overlap in some area and 
borrow on the expertise of their respective curators. The point is that it may 
now be very difficult to determine where a specific piece of data comes from. 
We use the term data provenance broadly to refer to a description of the origins 
of a piece of data and the process by which it arrived in a database. Most im- 
plementors and curators of scientific databases would like to record provenance, 
but current database technology does not provide much help in this process, for
databases are typically rather rigid structures and do not allow the kinds of ad 
hoc annotations that are often needed for recording provenance. 




Fig. 1. The Flow of Data in Bioinformatics 



^ Thanks to Susan Davidson, Fidel Salas and Chris Stoeckert of the Bioinformatics
Center at Penn for providing this information.

The databases used in molecular biology form just one example of why data
provenance is an important issue. There are other areas in which it is equally
acute [5]. It is an issue that is certainly broader than computer science, with legal
and ethical aspects. The question that computer scientists, especially theoretical
computer scientists, may want to ask is what are the technical issues involved 
in the study of data provenance. As in most areas of computer science, the hard 
part is to formulate the problem in a concise and applicable fashion. Once that is 
done, it often happens that interesting technical problems emerge. This abstract 
reviews some of the technical issues that have emerged in an initial exploration. 

2 Computing Provenance: Query Inversion 

Perhaps the only area of data provenance to receive any substantial attention 
is that of provenance of data obtained via query operations on some input 
databases. Even in this restricted setting, a formalization of the notion of data 
provenance turns out to be a challenging problem. Specifically, given a tuple t 
in the output of a database query Q applied on some source data D, we want to 
understand which tuples in D contributed to the output tuple t, and if there is a 
compact mechanism for identifying these input tuples. A natural approach is to 
generate a new query Q', determined by Q, D and t, such that when the query 
Q' is applied to D, it generates a collection of input tuples that “contributed 
to” the output tuple t. In other words, we would like to identify the provenance 
by inverting the original query. Of course, we have to ask what we mean by con- 
tributed to? This problem has been studied under various names including “data 
pedigree” and “data lineage” in [1, 9, 7]. One way we might answer this question 
is to say that a tuple in the input database “contributes to” an output tuple if 
changing the input tuple causes the output tuple to change or to disappear from 
the output. This definition breaks down on the simplest queries (a projection or 
union). A better approach is to use a simple proof-theoretic definition. If we are 
dealing with queries that are expressible in positive relational algebra (SPJU) 
or more generally in positive datalog, we can say that an input tuple (a fact) 
“contributes to” an output tuple if it is used in some minimal derivation of that 
tuple. This simple definition works well, and has the expected properties: it is 
invariant under query rewriting, and it is compositional in the expected way. 
Unfortunately, these desirable properties break down in the presence of negation 
or any form of aggregation. To see this consider a simple SQL query: 

SELECT name, telephone
FROM employee
WHERE salary > (SELECT AVG(salary) FROM employee)

Here, modifying any tuple in the employee relation could affect the presence of 
any given output tuple. Indeed, for this query, the definition of “contributes to” 
given in [9] makes the whole of the employee relation contribute to each tuple 
in the output. While this is a perfectly reasonable definition, the properties of 
invariance under query rewriting and compositionality break down, indicating 
that a more sophisticated definition may be needed. 
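
The effect of aggregation can be observed directly. The following Python sketch uses the standard sqlite3 module with a hypothetical three-row employee table (the rows, the column types, and the helper name run_query are assumptions made for illustration). It shows that changing a tuple other than John Doe's removes John Doe's tuple from the output.

```python
import sqlite3

QUERY = """
SELECT name, telephone
FROM employee
WHERE salary > (SELECT AVG(salary) FROM employee)
ORDER BY name
"""


def run_query(rows):
    """Load rows into an in-memory employee table and evaluate the query."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE employee (name TEXT, telephone INTEGER, salary REAL)")
    con.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


# Hypothetical data: the average salary is 40, so only John Doe qualifies.
before = [("John Doe", 12345, 50.0), ("Ann Lee", 23456, 40.0), ("Bob Ray", 34567, 30.0)]

# Only Bob Ray's tuple is modified; the average rises to 60.
after = [("John Doe", 12345, 50.0), ("Ann Lee", 23456, 40.0), ("Bob Ray", 34567, 90.0)]

if __name__ == "__main__":
    print(run_query(before))
    print(run_query(after))
```

John Doe's own tuple is identical in both runs, yet it appears only in the first result, which is why every tuple of the employee relation has to count as contributing to each output tuple.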

Before going further it is worth remarking that this characterization of prove- 
nance is related to the topics of truth maintenance [10] and view maintenance 
[12]. The problem in view maintenance is as follows. Suppose a database (a view)
is generated by an expensive query on some other database. When the source 
database changes, we would like to recompute the view without recomputing 
the whole query. Truth maintenance is the same problem in the terminology of 
deductive systems. What may make query inversion simpler is that we are only 
interested in what is in the database; we are not interested in updates that would 
add tuples to the database. 

In [7] another notion of provenance is introduced. Consider the SQL query 
above, and suppose we see the tuple ("John Doe", 12345) in the output. What
the previous discussion tells us is why that tuple is in the output. However, we 
might ask an apparently simpler question: given that the tuple appears in the 
output, where does the telephone number 12345 come from? The answer to this 
seems easy - from the "John Doe" tuple in the input. This seems to imply that 
as long as there is some means of identifying tuples in the employee relation, 
one can compute where-provenance by tracing the variable (that emits 12345) 
of the query. However, this intuition is fragile and a general characterization is 
not obvious; it is discussed in [7]. 
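
The intuition of tracing a value back to its source can be sketched as follows. This Python fragment is an illustration of the naive idea only, not the characterization given in [7]: every field is stored together with a location (relation, row identifier, attribute), and the query copies these annotated fields into its output. The names Loc, Cell, annotate, and above_average are hypothetical, and using the row index as a tuple identifier is an assumption.

```python
from typing import Dict, List, NamedTuple, Tuple


class Loc(NamedTuple):
    relation: str
    row_id: int
    attribute: str


class Cell(NamedTuple):
    value: object
    source: Loc


Row = Dict[str, Cell]


def annotate(relation: str, rows: List[dict]) -> List[Row]:
    """Attach to every field the location it is stored at."""
    return [
        {attr: Cell(value, Loc(relation, row_id, attr)) for attr, value in row.items()}
        for row_id, row in enumerate(rows)
    ]


def above_average(employee: List[Row]) -> List[Tuple[Cell, Cell]]:
    """Names and telephones of employees earning more than the average salary.

    Output cells are copied from the input, so each keeps its source location.
    """
    average = sum(r["salary"].value for r in employee) / len(employee)
    return [(r["name"], r["telephone"]) for r in employee if r["salary"].value > average]


# Hypothetical employee relation.
employee = annotate("employee", [
    {"name": "John Doe", "telephone": 12345, "salary": 50.0},
    {"name": "Ann Lee", "telephone": 23456, "salary": 40.0},
    {"name": "Bob Ray", "telephone": 34567, "salary": 30.0},
])
output = above_average(employee)

if __name__ == "__main__":
    name, telephone = output[0]
    print(telephone.value, "copied from", telephone.source)
```

The salary values of all rows influence whether the tuple appears at all, but the telephone number itself is copied from a single location, which is the distinction between the two notions of provenance drawn above.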

We remark that this second form of provenance, where-provenance, is also 
related to the view update problem [3]: if John Doe decides to change his tele- 
phone number at the view, which data should be modified in the employee 
relation? Again, where-provenance seems simpler because we are only interested 
in modifications to the existing view; we are not interested in insertions to the 
view. 

Another issue in query inversion is to capture other query languages and 
other data models. For example, we would like to describe the problem in object- 
oriented [11] or semistructured data models [2] (XML). What makes these models 
interesting is that we are no longer operating at the fixed level of tuples in the 
relational model. We may want to ask for the why- or where-provenance of some 
deeply nested component of some structure. To this end, [7] studies the issue 
of data provenance in a “deterministic” model of semistructured data in which 
every element has a canonical path or identifier. View maintenance 
based on this model has also been studied in [14]. This leads us to our next 
topics, those of citing and archiving data. 

3 Data Citation 

A digital library is typically a large and heterogeneous collection of on-line docu- 
ments and databases with sophisticated software for exploring the collection [13]. 
However, many digital libraries are also being organized so that they serve as 
scholarly resources. This being the case, how do we cite a component of a digital 
library? Surprisingly, this topic has received very little attention. There appear 
to be no generally useful standards for citations. Well organized databases are 
constructed with keys that allow us uniquely to identify a tuple in a relation. 
By giving the attribute name we can identify a component of a tuple, so there 
is usually a canonical path to any component of the database. 

How we cite portions of documents, especially XML documents, is not so 
clear. A URL provides us with a universal locator for a document, but how 
are we to proceed once we are inside the document? Page numbers and line 
numbers - if they exist - are friable, and we have to remember that an XML 
document may now represent a database for which the linear document structure 
is irrelevant. There are some initial notions of keys in the XML standard [4] 
and in the XML Schema proposals [16]. In the XML Document Type Descriptor 
(DTD) one can declare an ID attribute. Values for this attribute are to be unique 
in the document and can be used to locate elements of the document. However 
the ID attribute has nothing to do with the structure of the document - it is 
simply a user-defined identifier. 

In XML Schema the definition of a key relies on XPath [8], a path description 
language for XML. Roughly speaking a key consists of two paths through the 
data. The first is a path, for example Department/Employee, that describes the 
set of nodes upon which a key constraint is to be imposed. This is called the 
target set. The second is another path, for example IdCard/Number that uniquely 
identifies nodes in the target set. This second part is called the key path, and 
the rule is that two distinct nodes in the target set must have different values 
at the end of their key paths. Apart from some details and the fact that XPath 
is probably too complex a language for key specification, this definition is quite 
serviceable, but it does not take into account the hierarchical structure of keys 
that are common in well-organized databases and documents. 

To give an example of what is needed, consider the problem of citing a 
part of a bible, organized by book, chapter and verse. We might start with 
the idea that books in the bible are keyed by name, so we use the pair of paths 
(Bible/Book, Name). We are assuming here that Bible is the unique root. Now 
we may want to indicate that chapters are specified by number, but it would 
be incorrect to write (Bible/Book/Chapter, Number) because this says that 
chapter numbers are unique within the bible. Instead we need to specify a 
relative key which consists of a triple, (Bible/Book, Chapter, Number). What 
this means is that the (Chapter, Number) key is to hold at every node specified 
by the path Bible/Book. 
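
The difference between the incorrect absolute key and the relative key can be
seen on a small document. This is a sketch under assumptions: it uses Python's
`xml.etree.ElementTree`, the helper names are hypothetical, and because
ElementTree evaluates paths relative to the root element, the leading `Bible`
step of each path is represented by the root itself.

```python
import xml.etree.ElementTree as ET


def holds_key(context, target_path, key_path):
    """True if distinct target nodes under `context` have distinct key values."""
    values = [n.findtext(key_path) for n in context.findall(target_path)]
    return len(values) == len(set(values))


def holds_relative_key(root, context_path, target_path, key_path):
    """True if (target_path, key_path) is a key at every node on context_path."""
    return all(holds_key(c, target_path, key_path)
               for c in root.findall(context_path))


bible = ET.fromstring("""
<Bible>
  <Book><Name>Genesis</Name>
    <Chapter><Number>1</Number></Chapter>
    <Chapter><Number>2</Number></Chapter>
  </Book>
  <Book><Name>Exodus</Name>
    <Chapter><Number>1</Number></Chapter>
  </Book>
</Bible>
""")

books_keyed_by_name = holds_key(bible, "Book", "Name")
chapters_keyed_globally = holds_key(bible, "Book/Chapter", "Number")
chapters_keyed_per_book = holds_relative_key(bible, "Book", "Chapter", "Number")
print(books_keyed_by_name, chapters_keyed_globally, chapters_keyed_per_book)
```

Chapter 1 occurs in both books, so the absolute key fails while the relative
key holds.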

A more detailed description of relative keys is given in [6] . While some basic 
inference results are known, there is a litany of open questions surrounding 
them: What are appropriate path languages for the various components of a 
key? What inference results can be established for these languages? How do 
we specify foreign keys, and what results hold for them? What interactions are 
there between keys and DTDs? These are practical questions that will need to 
be answered if, as we do in databases, we are to use keys as the basis for 
indexing and query optimization. 

4 Archiving and Other Problems Associated with 
Provenance 

Let us suppose that we have a good formulation, or even a standard, for data 
citation, and that document A cites a (component of a) document B. Whose 
responsibility is it to maintain the integrity of B? The owner of B may wish to 
update it, thereby invalidating the citation in A. This is a serious problem in 
scientific databases, and what is commonly done is to release successive versions 
of a database as separate documents. Since one version is - more or less - an 
extension of the previous version, this is wasteful of space and the space overhead 
limits the rate at which one can release versions. Also, it is difficult when the 
history of a database is kept in this form to trace the history of components 
of the database as defined by the key structure. There are a number of open 
questions: 

— Can we compress versions so that the history of A can be efficiently recorded? 

— Should keeping the cited data be the responsibility of A rather than B? 

— Should B figure out what is being cited and keep only those portions? 

In this context it is worth noting that, when we cite a URL, we hardly ever give 
a date for the citation. If we did this, at least the person who follows the citation 
will know whether to question the validity of the citation by comparing it with 
the timestamp on the URL. 

Again, let us suppose that we have an agreed standard for citations and 
that, rather than computing provenance by query inversion (which is only 
possible when the data of interest is created by a query), we decide to annotate 
each element in the database with one or more citations that describe its 
provenance. What is the space overhead for doing this? Given that the citations 
have structure and that the structure of the citations will, in part, be related 
to the structure of the data, one assumes that some form of compression is 
possible. 

Finally, one is tempted to speculate that we may need a completely different 
model of data exchange and databases to characterize and to capture provenance. 
One could imagine that data is exchanged in packages that are “self aware” and 
somehow contain a complete history of how they moved through the system of 
databases, of how they were constructed, and of how they were changed. The 
idea is obviously appealing, but whether it can be formulated clearly, let alone 
be implemented, is an open question. 

Abstract. The problem of computing the strength and performing optimal 
reinforcement for an edge-weighted graph $G(V, E, w)$ is well-studied 
[1,2,3,6,7,9]. In this paper, we present fast (sequential linear time and 
parallel logarithmic time) on-line algorithms for optimally reinforcing 
the graph when the reinforcement material is available continuously on-line. 
These are the first on-line algorithms for this problem. Although we 
invest some time in preprocessing the graph before the start of our 
algorithms, it is also shown that the output of our on-line algorithms is 
as good as that of the off-line algorithms, making our algorithms viable 
alternatives to the fastest off-line algorithms in situations when a 
sequence of more than $O(|V|)$ reinforcement problems needs to be solved. 
In such a situation the time taken for preprocessing the graph is less than 
the time taken for all the invocations of the fastest off-line algorithms. 
Thus our algorithms are also efficient in the general sense. The key idea 
is to make use of the theory of Principal Partition of a Graph. Our results 
can be easily generalized to the general setting of principal partition of 
nondecreasing submodular functions. 



1 Introduction 

Let $G(V, E)$ denote a graph with $V$ as the vertex set and $E$ as the set of 
edges. We use $G(V, E, w)$ to denote a graph $G(V, E)$ with nonnegative 
edge-weights given by $w(\cdot)$. 

A fundamental problem concerning practical networks is that of making the 
connectivity reliable in the face of failure of individual edges. Cunningham [3] 
defined the strength of a graph $G(V, E, w)$ as 

\[
\min_{\emptyset \neq X \subseteq E}
\frac{w(X)}{\text{number of additional components created by destroying } X},
\]

which is the same as 

\[
\min_{\emptyset \neq X \subseteq E} \frac{w(X)}{r(E) - r(E - X)},
\]

where $r(Z)$ denotes the rank of the subgraph on edge set $Z$, which is defined 
as the sum of the ranks of the connected components of the subgraph on $Z$. 
Recall that the rank of a connected graph equals the number of vertices minus 
one. Thus the strength of a graph is a measure of its invulnerability. 

S. Kapoor and S. Prasad (Eds.): FST TCS 2000, LNCS 1974, pp. 94-105, 2000. 
© Springer-Verlag Berlin Heidelberg 2000 

Fast On-Line/Off-Line Algorithms for Optimal Reinforcement of a Network 

In [3] Cunningham considered the problem of computing the strength of a 
graph $G(V, E, w)$, and optimal reinforcement of the edge-weights to raise the 
strength to a prescribed level. 
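
For very small graphs the definition of strength can be evaluated directly by
enumerating edge subsets. The sketch below is exponential in the number of
edges and is meant only to make the definition concrete; it is not Cunningham's
algorithm. The function names, the vertex numbering `0..n-1` and the use of
integer weights (so that `Fraction` stays exact) are assumptions of this
example, and subsets whose removal creates no new component are skipped.

```python
from fractions import Fraction
from itertools import combinations


def rank(n, edges):
    """Rank of an edge set on vertices 0..n-1: size of a spanning forest."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    r = 0
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            r += 1
    return r


def strength(n, edges, w):
    """min over X of w(X) / (r(E) - r(E - X)), by brute force.

    edges: list of (u, v) pairs; w: list of integer weights in the same order.
    """
    full = rank(n, edges)
    idx = range(len(edges))
    best = None
    for k in range(1, len(edges) + 1):
        for X in combinations(idx, k):
            rest = [edges[i] for i in idx if i not in X]
            gain = full - rank(n, rest)  # additional components created
            if gain > 0:
                ratio = Fraction(sum(w[i] for i in X), gain)
                if best is None or ratio < best:
                    best = ratio
    return best


triangle = [(0, 1), (1, 2), (0, 2)]
print(strength(3, triangle, [1, 1, 1]))  # 3/2
```

For the unit-weight triangle the minimum is attained by destroying all three
edges, which creates two additional components.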

In this paper we show the relation between these problems and the notion 
of the Principal Partition (PP) of a graph. We show that computing the PP of 
a graph beforehand allows us to solve the “successive (on-line) reinforcement” 
problem very efficiently. Furthermore, it also gives us a lot of information about 
the relationship between the amount of reinforcement material available and 
maximum strength realizable by utilizing it. Throughout, the algorithms based 
on PP are extremely simple and efficient. 

The problem of computing optimal reinforcement to strengthen a graph 
$G(V, E, w)$ is as follows: 

Problem 1. Main Problem: Optimal Reinforcement of a Graph 

Given $G(V, E, w)$ and the required strength $\lambda$, find a vector of increase 
(zero increase allowed) in the weights of the edges such that with the resulting 
edge-weights the graph has strength equal to $\lambda$ and the total increase is 
minimum. (We are not permitted to introduce new edges.) 

Using the well known ideas from “fractional programming” [3,4], one sees 
that it would suffice to consider the following family (parameterized by real 
values $\lambda$) of problems: 

\[
\min_{X \subseteq E} \{ w(X) - \lambda \cdot (r(E) - r(E - X)) \},
\]

or equivalently, 

\[
\max_{X \subseteq E} \{ w(X) - \lambda \cdot r(X) \}.
\]

This latter is the Weighted Principal Partition Problem: given a graph 
$G(V, E)$ and a real positive weight assignment $w$ to the edges, find for each 
real $\lambda$ the collection of all subsets of $E$ which maximize 
$w(\cdot) - \lambda \cdot r(\cdot)$. 

It turns out that it is sufficient to consider not more than $r(E)$ values of 
$\lambda$ to solve this problem completely. Using these “critical values” the PP 
(the above collection of subsets) can be constructed and stored efficiently 
[6,16,17,19]. 

It may be remarked that efficient algorithms for computation of PP for the 
real weights case have already been given in [10,14,16,17]. But to reveal the 
connection between the classical idea of PP and the “strength and reinforcement” 
problem, we present a new approach to computation of the required information 
from the PP, which runs in the same time as required by the previous best 
algorithms in [16,17]. 

The problem of computing strength and optimal reinforcement has been studied 
and solved efficiently by several researchers [1,2,3,6,7,9]. The best algorithms 
are due to Gabow [7] and Cheng and Cunningham [2]. They solve the problems 
of computing strength and min-cost optimal reinforcement each in time 
$O(|V|^2 |E| \log(|V|^2/|E|))$. It may be remarked that there are no known 
faster algorithms for the unit-cost optimal reinforcement problem. The ideas 
underlying the best algorithms also appeared in [1,3,9]. We may also mention 
that many of the fast algorithms mentioned above are based on ideas from [8]. 
The connection of the problem of computing the strength with the classical 
problem of Principal Partition was brought out by Fujishige [6], who related the 
strength to the smallest critical value in a principal partition of a dual of the 
rank function of the given graph. We extend this connection of principal 
partition to optimal reinforcement, too, using it to build fast oracles suitable 
for on-line real-time computations. 

We mainly consider the practical problem of doing optimal reinforcement. 
After computation of a certain skeleton of the principal partition, called the 
Principal Sequence, we solve the problem of optimal reinforcement quickly 
(sequential linear time and parallel logarithmic time) for every successive 
request of optimal reinforcement. Thus our algorithm is indeed on-line and 
real-time too. When the number of requests for successive optimal reinforcement 
is larger than $O(|V|)$, our algorithms turn out to be faster than the fastest 
off-line algorithms due to [2,7], without any loss of quality of output. 

We solve the following problems which relate to the on-line version of the 
optimal reinforcement problem: 

Problem 2. Problem (P1): 

Given a graph $G(V, E, w)$, build an efficient oracle for the function $W(\cdot)$ 
that maps the required strength $\lambda$ to the minimum total amount of weight 
augmentation to be performed to increase the strength to $\lambda$. 

The following is an “inverse” of the above problem: 

Problem 3. Problem (P2): 

Given a graph $G(V, E, w)$ and a specified amount $W$ of total weight 
augmentation permitted, find the maximum strength achievable by using the amount $W$ 
to augment weights of the edges. Note that we are not permitted to decrease 
any of the existing weights. 

Now consider the following practical situation: Reinforcement material is 
made available in arbitrary quantities at arbitrary intervals. At every stage when 
the reinforcement arrives, we are required to utilize the reinforcement fully and 
optimally (that is, without saving some amount for future use, and making 
sure that the graph is strengthened to the best level using the current lot of 
reinforcement). 

Furthermore, at any stage, the cumulative augmentation (or reinforcement) 
should look as if it were the optimal reinforcement done if the whole reinforce- 
ment were made available in one go and we were supposed to utilize it optimally. 

We formally model the above scenario as the following problem and solve it 
in this paper: 

Problem 4. Problem (P3): 

Given a graph $G(V, E, w)$, build the oracles for the family of monotonic 
non-decreasing functions 

\[
\{ f_e : [\lambda_t, \infty] \to \mathbb{R}_+ \mid e \in E \},
\]

which satisfy the following: 

For $\lambda > \lambda_t$, $\{ f_e(\lambda) \mid e \in E \}$ represent the weight 
augmentations to be carried out for each edge $e \in E$ such that the strength of 
the graph $G(V, E)$ goes up to $\lambda$. Furthermore, 
$\sum_{e \in E} f_e(\lambda)$ is required to be minimum. 

Thus the decisions as to how much weight to be augmented for a given edge 
in the current stage would be taken by querying the above oracles. We would 
like these oracles to be efficient so that the above decisions could be taken in 
real-time, and we also want an efficient algorithm for building this family of 
oracles. It is clear that as a consequence of building the above oracles, we would 
have an on-line and real-time algorithm to solve the problem of reinforcement of 
a graph optimally. 

Now we review the literature for PP briefly: The PP of a graph for the case 
$\lambda = 2$ was constructed by Kishi and Kajitani (see [16] for details). For 
arbitrary real $\lambda$ the problem was solved for matroids independently by Narayanan 
[13] and Tomizawa [19]. Extensive work has been done on these problems and 
their generalization to matroids, particularly in Japan [5,6,10,11,14,16,17,19]. 
Details may be found in [6,16]. The Principal Partition problem also has many 
significant applications: to Electrical Network theory (see, for instance, [11,16]), 
Fault tolerant computing [12] and to Engineering Systems in general [11]. Using 
the Principal Partition of a graph, approximate algorithms were designed for 
an NP-hard problem of computing Min-k-cut of a graph (see [15]). Patkar and 
Narayanan [16,17] gave an 0(|if||U|^?og|U|) algorithm for computing the whole 
Weighted Principal Partition of the graph. 

Due to lack of space, certain details are omitted. The interested reader is 
referred to [18] for the complete version of this paper. 

2 Preliminaries and Notation 

We deal throughout with finite sets. A function $f$ on the subsets of $S$ is said 
to be submodular if $f(A) + f(B) \geq f(A \cup B) + f(A \cap B)$ for all 
$A, B \subseteq S$. $f$ is said to be supermodular iff $-f$ is submodular. We say 
that a function is normalized if it takes value 0 on the empty set. A function 
$f$ is nondecreasing if $X \subseteq Y \Rightarrow f(X) \leq f(Y)$. A normalized, 
nondecreasing and submodular function is also called a polymatroid function. 

$G(V, E) \times Z$ denotes the graph obtained by contracting $E - Z$ from 
$G(V, E)$. Note that we maintain $G \times Z$ as a multigraph; that is, every 
edge in $G \times Z$ corresponds to some edge in the original graph $G$. 
$G(V, E) \cdot Z$ denotes the subgraph of $G(V, E)$ induced by $Z$. By abuse of 
notation $r(G)$ denotes the rank of the set of edges of the graph $G$. Note that 
$r(\cdot)$ is a normalized, nondecreasing and submodular function. 
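
The three properties claimed for the rank function can be checked exhaustively
on a small example. The following is an illustrative sketch, not part of the
paper's method: the helper names and the five-edge test graph are assumptions,
edges are stored in a `frozenset` (so parallel edges are not represented), and
the check is exponential in the number of edges.

```python
from itertools import combinations


def rank(edges):
    """Rank of an edge set: number of edges in a spanning forest."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    r = 0
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            r += 1
    return r


def powerset(ground):
    items = list(ground)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]


def is_polymatroid(f, ground):
    """Check normalized, nondecreasing and submodular by enumeration."""
    sets = powerset(ground)
    normalized = f(frozenset()) == 0
    nondecreasing = all(f(A) <= f(B) for A in sets for B in sets if A <= B)
    submodular = all(f(A) + f(B) >= f(A | B) + f(A & B)
                     for A in sets for B in sets)
    return normalized and nondecreasing and submodular


# A 4-cycle with one chord (hypothetical test graph).
E = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
print(is_polymatroid(rank, E))
```

As a contrast, the function $|X|^2$ is normalized and nondecreasing but fails
the submodular inequality already for two disjoint singletons.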

$V(X)$ denotes the set of vertices spanned by the edge set $X$. $\mathcal{E}(U)$ 
denotes the set of edges having both the endpoints in $U$. The subgraph of $G$ on 
$U \subseteq V$ is denoted by $(U, \mathcal{E}(U))$. Note that 
$w(\cdot) - k \cdot r(\cdot)$ is a supermodular function when $k \geq 0$. 

The following fact from the literature on the submodular functions [6,16,19] 
is very basic and useful. 

Theorem 1. [6,13,16,19] Let $W$ be a set and $U \subseteq W$. Let 
$f : 2^W \to \mathbb{R}$ be a submodular (supermodular) function. Then the 
subsets which minimize (maximize) the function $f(\cdot)$ over all those subsets 
of $W$ which contain $U$ form a lattice under the usual operations of union and 
intersection. In particular there exist unique smallest and largest such sets. 

3 Some Relevant Properties of Principal Partition 

Some properties of the Principal Partition which are relevant to this paper are 
as follows [16,19]: 

1. There is a unique maximal set $X^k$ and a unique minimal set $X_k$ at which 
$w(\cdot) - k \cdot r(\cdot)$ reaches the maximum. We call these sets critical 
sets in the Principal Partition of the graph. 

2. If $k_1 > k_2$, it can be shown that $X^{k_1} \subseteq X_{k_2}$. 

3. For finitely many values of $k$, $X_k \neq X^k$. Such values are called 
critical values in the Principal Partition of the graph. 

4. There are at most $r(E)$ critical values in the Principal Partition of the 
graph. 

5. Let $\lambda_1 > \lambda_2 > \lambda_3 \ldots > \lambda_t$ be the sequence of 
all critical values. The last critical value $\lambda_t$ (see [6]) is equal to 
the strength of the given graph $G$ under the edge-weights $w$. We also take 
$\lambda_0 = \infty$ as a convention. 

Furthermore, $X^{\lambda_i} = X_{\lambda_{i+1}}$ for $i = 1, 2, \ldots, t-1$, and 
$X_{\lambda_1} = \emptyset$, $X^{\lambda_t} = E$. 

The sequence $X_{\lambda_1} \subseteq X_{\lambda_2} \ldots X_{\lambda_t} \subseteq X^{\lambda_t}$ 
is called the Principal Sequence. 

The following characterization of strength (thus that of smallest critical 
value) is implicit in [3]. 

Lemma 1. Let $G(V, E, w)$ be the given graph with edge-weights given by 
$w(\cdot)$. Let $\sigma$ be such that 
$w(Z) - \sigma \cdot r(Z) = w(E) - \sigma \cdot r(E) = \max_{X \subseteq E} \{ w(X) - \sigma \cdot r(X) \}$ 
for a proper subset $Z \subset E$. Then $\sigma$ is the strength of 
$G(V, E, w)$. The converse also holds. 

We state the following result about the computation of the Principal Sequence 
of $G(V, E, w)$, which will follow from the algorithm in Section 7. 

Theorem 2. The Principal Sequence of $G(V, E, w)$ can be computed using 
$O(|V|)$ invocations of the subroutine that computes the strength of a given 
subgraph of $G(V, E, w)$. 







4 An Algorithm Based on Principal Partition for 
Minimum-Weight Reinforcement 

In this section we present an algorithm which will perform minimum-weight 
reinforcement using the Principal Sequence, which is assumed to be available. 
The ideas from this algorithm will be used to build an on-line algorithm (using 
the oracles as described in Problem (P3)) in a later section. We start with the 
following definition. 

Definition 1. Let $X_0 \subset X_1 \subset X_2 \ldots \subset X_t$ (with 
$X_0 = \emptyset$ and $X_t = E$) be the Principal Sequence of $G(V, E, w)$. Let 
$\lambda_1 > \lambda_2 \cdots > \lambda_t$ be the sequence of critical values. 
Let $E_i = X_i - X_{i-1}$, and $G_i = (G \cdot X_i) \times (X_i - X_{i-1})$ for 
$i = 1, 2, \ldots, t$. 

Thus $G_i$ is a minor of $G$ obtained by first restricting it to $X_i$ and then 
contracting out the subset $X_{i-1}$. By one of the properties of PP, 
$\lambda_t$ is the strength of $G(V, E, w)$. We wish to increase the strength to 
$\lambda$ by augmenting the weight function $w$ to a suitable $\hat{w}$. We also 
require that the total augmentation in the weights, that is 
$(\hat{w} - w)(E)$, is as small as possible. 

4.1 Algorithm 1 

Our algorithm is as follows: 

Algorithm 1 

- Let $p$ be the smallest index such that $\lambda > \lambda_p$. Thus 
$\lambda_{p-1} \geq \lambda > \lambda_p$. 

- Let $F_p$ be a subset of edges of $E$ that forms a spanning forest of $G_p$. 
Recall that we maintain the $G_i$'s (graphs after contraction) as multigraphs; 
that is, every edge in $G_i$ corresponds to some edge in the original graph $G$. 

- We add $\lambda - \lambda_p$ to the weight of each of the edges in $F_p$. 

- For each $j$ from $p + 1$ to $t$ we do something similar: 

• Let $F_j$ be a subset of edges of $E$ that forms a spanning forest of $G_j$. 

• We add $\lambda - \lambda_j$ to the weight of each of the edges in $F_j$. 
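
The augmentation steps of Algorithm 1 can be sketched in a few lines once the
critical values and the spanning forests are known. Everything concrete below
is an assumption made for illustration: the function name, the dictionary
representation of weights, and the example graph (a triangle with edge weights
2 plus a pendant edge of weight 1), whose critical values 3 and 1 and forests
were worked out by hand rather than by the algorithm of Section 7.

```python
def reinforce(weights, forests, critical_values, target):
    """Sketch of Algorithm 1.

    forests[j-1] holds the edges of F_j and critical_values[j-1] holds
    lambda_j, with the critical values in decreasing order. Returns the
    augmented weights and the total amount of weight added.
    """
    new_weights = dict(weights)
    total = 0
    for forest, lam_j in zip(forests, critical_values):
        # Because the critical values decrease, target > lam_j holds
        # exactly for the indices j >= p.
        if target > lam_j:
            for e in forest:
                new_weights[e] += target - lam_j
                total += target - lam_j
    return new_weights, total


w = {(0, 1): 2, (1, 2): 2, (0, 2): 2, (2, 3): 1}
forests = [[(0, 1), (1, 2)], [(2, 3)]]   # F_1 in the triangle, F_2 the pendant edge
critical = [3, 1]                        # lambda_1 > lambda_2 = current strength

new_w, added = reinforce(w, forests, critical, 4)
print(new_w, added)
```

Here the total added weight is (4 - 3) * 2 + (4 - 1) * 1 = 5, which agrees with
the sum over $j \geq p$ of $(\lambda - \lambda_j)$ times the rank of $G_j$.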

4.2 Proof of Correctness of Algorithm 1 

We need to prove the following. 

Theorem 3. Algorithm 1 uses minimum weight augmentation in order to increase 
the strength of a graph $G(V, E, w)$ to a prescribed level. Furthermore, the 
minimum weight augmentation required to increase the strength from $\lambda_t$ 
to $\lambda$ equals $\sum_{j=p}^{t} (\lambda - \lambda_j) \cdot \mathrm{rank}(G_j)$. 

To prove Theorem 3 we will use the following definitions and results. 

Definition 2. Let $G(V, E)$ be a graph with rank function $r(\cdot)$ on the 
subsets of edges. Let $w(\cdot)$ be a non-negative weight function on the edges 
of this graph and let $\lambda$ be a non-negative real. We say that the graph is 
molecular with respect to $(w, \lambda)$ if 

\[
w(\emptyset) - \lambda \cdot r(\emptyset) = w(E) - \lambda \cdot r(E)
= \max_{X \subseteq E} \{ w(X) - \lambda \cdot r(X) \}. \tag{1}
\]

We need a few lemmas. 

By the characterization of strength (Lemma 1) it is clear that, 

Lemma 2. A graph is molecular w.r.t. $(w, \lambda)$ if and only if its strength 
is equal to $\lambda$ when the edge-weights are given by $w(\cdot)$, and the 
total edge-weight of such a graph is equal to ($\lambda$ * rank of the graph). 

The proofs of the following lemmas (Lemma 3 and Lemma 4) will easily follow 
from the algorithm given in a later section that constructs the PP of 
$G(V, E, w)$. 

Lemma 3. For each $i = 1, 2, \ldots, t$, the strength of 
$G \cdot (E_1 \cup E_2 \cup \ldots \cup E_i)$ equals $\lambda_i$. 

Lemma 4. Let $G_i$ be as defined before for $i = 1, 2, \ldots, t$. The strength 
of $G_i$ is equal to $\lambda_i$ and it is molecular with respect to 
$(w, \lambda_i)$. 

We state the following lemmas without proofs (see [18] for complete details). 

Lemma 5. Let $\sigma$ be the strength of $G(V, E, w)$. Let $F$ be any spanning 
forest of $G(V, E)$. If we increase the weights of each edge of $F$ by $\alpha$, 
then the strength of the weight-augmented graph is at least $\sigma + \alpha$. 
Furthermore, if $G(V, E, w)$ is molecular w.r.t. $(w, \sigma)$ then the 
resulting strength is equal to $\sigma + \alpha$, and the resulting graph is 
molecular w.r.t. $(w', \sigma + \alpha)$, where $w'$ denotes the augmented 
weights. 

Lemma 6. Let $G'(V', E')$ be a graph with edge-weights $w'$. Let $\sigma$ be the 
strength of $G'$. Let $\emptyset \neq Z \subset E'$ be such that $G' \cdot Z$ has 
strength $\sigma_1$ that is at least as large as $\sigma$. Further suppose 
$G' \times (E' - Z)$ is molecular w.r.t. $(w', \sigma)$. Let 
$\sigma_1 \geq \lambda > \sigma$. Then addition of weight $\lambda - \sigma$ to 
any spanning forest $F$ of $G' \times (E' - Z)$ raises the strength of $G'$ to 
$\lambda$. 

We state a simple lemma about strength after any contraction or augmenta- 
tion operation. The proof follows immediately from the definitions. 

Lemma 7. Strength of a graph does not decrease after any augmentation or 
contraction. 

Now we are ready to prove Theorem 3. 

Proof of Theorem 3: 

We first establish that the resulting weight function provides the required 
strength, that is, $\lambda$, to the graph $G(V, E)$. 

We use the following notation: $\hat{G}_i = G \cdot (E_1 \cup \ldots \cup E_i)$. 
$G_i$ is as defined before, and let $r_i(\cdot)$ denote the rank function of 
$G_i$ for $i = 1, 2, \ldots, t$. 

We prove by induction on $i = p, p+1, \ldots, t$ that after the augmentation 
step on $G_i$, $\hat{G}_i$ has strength $\lambda$, but the strength of 
$\hat{G}_{i+1}$ remains at $\lambda_{i+1}$. 

Induction base: Lemma 3 and Lemma 4 tell us that prior to the augmentation step 
the strength of $G_p$ is $\lambda_p$, the strength of $\hat{G}_{p-1}$ is 
$\lambda_{p-1}$ and the strength of $\hat{G}_p$ is $\lambda_p$. 

If $p = 1$ then we make use of Lemma 5 to conclude that the strength of 
$\hat{G}_p$ becomes $\lambda$. Otherwise, we apply Lemma 6 to $\hat{G}_p$, 
$\hat{G}_{p-1}$ and $G_p$. This establishes that after the augmentation step on 
$G_p$, the strength of $\hat{G}_p$ becomes $\lambda$. But, under the augmented 
weights, the strength of $\hat{G}_{p+1}$ remains at $\lambda_{p+1}$, as 

\[
\text{strength of } \hat{G}_{p+1} \leq \text{strength of } G_{p+1} = \lambda_{p+1}
\]

and the strength of $\hat{G}_{p+1}$ would not decrease from its earlier value of 
$\lambda_{p+1}$ after the augmentation step. (The inequality above follows from 
Lemma 7.) Thus the induction base is proved. 

The proof of the inductive step is along similar lines as above. In fact, 
the key idea is once again the use of Lemma 6. 

Now we establish that the weight added is the minimum that is required to 
increase the strength to the prescribed level, $\lambda$. 

Towards this, one may look at the graph obtained from $G(V, E)$ by contracting 
out the set of edges $E_1 \cup E_2 \cdots \cup E_{p-1}$. The resulting graph is 
on the set of edges $E_p \cup E_{p+1} \cdots \cup E_t$. Let $G'$ denote this 
resulting graph. 

We once again make use of Lemma 7. For the strength of $G(V, E)$ to be 
$\lambda$, it is required that the strength of $G'$ should be at least 
$\lambda$. Thus, $G'$ must have at least $\lambda \cdot \mathrm{rank}(G')$ as the 
total weight of the edges after the augmentation. Now 
$\mathrm{rank}(G') = \sum_{j=p}^{t} \mathrm{rank}(G_j)$. Thus the weight of the 
edges of $G'$ is at least $\sum_{j=p}^{t} \lambda \cdot \mathrm{rank}(G_j)$. 

But the original weight of the edges of $G'$ was 
$\sum_{j=p}^{t} \lambda_j \cdot \mathrm{rank}(G_j)$. This follows from the 
molecularity of $G_j$ w.r.t. $(w, \lambda_j)$ and Lemma 2. 

Thus the minimum weight augmentation required to increase the strength 
from $\lambda_t$ to $\lambda$ equals 

\[
\sum_{j=p}^{t} (\lambda - \lambda_j) \cdot \mathrm{rank}(G_j).
\]

But then, our algorithm has used exactly the same (as above) amount of 
weight augmentation for increasing the strength; thus our algorithm has 
performed minimum weight augmentation of edge weights to increase the 
strength from $\lambda_t$ to $\lambda$. q.e.d. 

5 Minimum Weight Successive Augmentation to Increase 
the Strength of a Graph 

In this section we show how the ideas underlying Algorithm 1 can be used to 
solve Problem (P3). 

To show this, we make use of the spanning forests $F_1, F_2, \ldots, F_t$ of the 
graphs $G_1, G_2, \ldots, G_t$, respectively. 

Define for $\lambda > \lambda_t$, 

\[
f_e(\lambda) =
\begin{cases}
\lambda - \lambda_j & \text{if } e \in F_j \text{ and } \lambda > \lambda_j, \\
0 & \text{otherwise.}
\end{cases}
\tag{2}
\]

The time required to build efficient oracles for the functions $f_e(\cdot)$, 
$e \in E$, is clearly dominated by the time required for finding the sets 
$E_1, E_2, \ldots, E_t$, which (by Theorem 2) may be done by $O(|V|)$ 
invocations of Cheng and Cunningham’s or Gabow’s algorithm from [2,7]. 

After choosing the spanning forests $F_1, F_2, \ldots, F_t$, we remember in each 
oracle for $f_e(\cdot)$ the pair $(j, \lambda_j)$ if $e \in F_j$ for some $j$, 
and if no such $j$ exists the oracle returns 0. 

Clearly, each oracle $f_e(\cdot)$ answers in constant time. 
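
The per-edge oracles described above amount to a table lookup. The sketch below
shows one way to store the pair for each forest edge and to use the oracles for
successive reinforcement; the function name, the dictionary layout and the
example data (a triangle with weights 2 plus a pendant edge of weight 1, with
hand-computed critical values 3 and 1) are assumptions for illustration only.

```python
def make_edge_oracle(forests, critical_values):
    """Build f(e, lam) from the forests F_1..F_t and lambda_1 > ... > lambda_t."""
    stored = {}
    for j, (forest, lam_j) in enumerate(zip(forests, critical_values), start=1):
        for e in forest:
            stored[e] = (j, lam_j)   # the pair remembered for edge e

    def f(e, lam):
        if e not in stored:          # e lies in no forest: never augmented
            return 0
        _, lam_j = stored[e]
        return lam - lam_j if lam > lam_j else 0

    return f


edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
f = make_edge_oracle([[(0, 1), (1, 2)], [(2, 3)]], [3, 1])

# Material arrives in two lots: first enough to reach strength 2, later 4.
first_lot = {e: f(e, 2) for e in edges}
second_lot = {e: f(e, 4) - f(e, 2) for e in edges}
print(first_lot, second_lot)
```

Because each $f_e$ is nondecreasing, the increment for a new lot is simply the
difference of two oracle values, and the cumulative augmentation equals what a
single request for the final strength would have produced.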

Theorem 4. If we use ideas in Algorithm 1, then the time required to solve 
Problem (P3) is $O(|V| \cdot T_{strength})$, where $T_{strength}$ denotes the 
time required to compute the strength of graph $G(V, E, w)$. The space 
complexity of the oracle for $f_e(\cdot)$ for any edge $e \in E$ is also 
$O(1)$. Furthermore, each of the oracles of Problem (P3) requires constant time 
to provide an answer, if it is fed with the input $\lambda$. 

From the above discussion and Algorithm 1 we get the following lemma. 

Lemma 8. The function $W(\cdot)$ that maps strength $\lambda$ to the minimum 
total weight augmentation required to increase the strength to $\lambda$ is 
given by the following piecewise linear function: 
$W : [\lambda_t, \infty] \to \mathbb{R}_+$ and it satisfies, 

- $W(\lambda_t) = 0$, 

- and the slope of $W(\cdot)$ in the domain interval 
$[\lambda_{i+1}, \lambda_i]$ is equal to $\sum_{j=i+1}^{t} r(G_j)$. 

Thus an efficient oracle for $W(\cdot)$ as well as its inverse map could be built 
easily, after having computed the graphs $G_1, G_2, \ldots, G_t$ by $O(|V|)$ 
invocations of the algorithm that finds the strength of a graph (on a sequence 
of smaller and smaller graphs). 

Note that in the construction of the above oracle for $W(\cdot)$, one could make 
use of the values and the slopes of $W(\cdot)$ in the domain intervals 
$[\lambda_{i+1}, \lambda_i]$ for $i = 0, 1, 2, \ldots, t - 1$. 

With this information in the oracle, the oracle and its inverse can provide 
the answers in time logarithmic in $|V|$. Thus we have a good solution for 
Problem (P1) and Problem (P2). 
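
One possible realization of such an oracle stores the breakpoints of the
piecewise linear function in increasing order and answers queries by binary
search. This is a sketch under assumptions: the class and method names are
invented, each $G_j$ is assumed to have positive rank, and the example data
(critical values 3 and 1 with ranks 2 and 1) come from a hand-worked graph
rather than from the paper.

```python
from bisect import bisect_right


class ReinforcementOracle:
    """Piecewise linear W and its inverse, queried by binary search."""

    def __init__(self, critical_values, ranks):
        # critical_values[j-1] = lambda_j, ranks[j-1] = r(G_j).
        self.lams, self.values, self.slopes = [], [], []
        slope = value = 0
        for lam, r in sorted(zip(critical_values, ranks)):  # increasing lambda
            if self.lams:
                value += slope * (lam - self.lams[-1])
            slope += r                  # slope to the right of this breakpoint
            self.lams.append(lam)
            self.values.append(value)   # W at this breakpoint
            self.slopes.append(slope)

    def W(self, lam):
        """Minimum total augmentation needed to reach strength lam."""
        i = bisect_right(self.lams, lam) - 1
        if i < 0:
            raise ValueError("target is below the current strength")
        return self.values[i] + self.slopes[i] * (lam - self.lams[i])

    def max_strength(self, budget):
        """Largest strength reachable with total augmentation `budget`."""
        if budget < 0:
            raise ValueError("budget must be nonnegative")
        i = bisect_right(self.values, budget) - 1
        return self.lams[i] + (budget - self.values[i]) / self.slopes[i]


oracle = ReinforcementOracle([3, 1], [2, 1])
print(oracle.W(2), oracle.W(4), oracle.max_strength(5))
```

Both queries cost one binary search over at most $t$ breakpoints, and $t$ is
bounded by the rank of the graph, which gives the logarithmic bound.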

6 Another Algorithm for Optimal Reinforcement of a 
Graph 

We now present another, slightly modified, algorithm which would give a tech- 
nique for a different approach for solving Problem (P3) (see [18] for details). 

Algorithm 2 

- Let $p$ be the smallest index such that $\lambda > \lambda_p$. Thus 
$\lambda_{p-1} \geq \lambda > \lambda_p$. 

- Let $F_p$ be a subset of edges of $E$ that forms a spanning forest of 
$G \times (E_p \cup E_{p+1} \cup \ldots \cup E_t)$. 

- We add $\lambda - \lambda_p$ to the weight of each of the edges in $F_p$. 

- For each $j$ from $p + 1$ to $t$ we do something similar: 

• Let $F_j$ be a subset of edges of $E$ that forms a spanning forest of 
$G \times (E_j \cup E_{j+1} \cup \ldots \cup E_t)$. 

• We add $\lambda - \lambda_j$ to the weight of each of the edges in $F_j$. 

Theorem 5. The above algorithm computes a minimum weight augmentation 
to increase the strength from $\lambda_t$ to $\lambda$. 

7 An Algorithm to Compute Principal Sequence 

In what follows, we describe a new approach based on Cunningham’s algorithm 
[3] for computation of the Principal Sequence of a graph. 

A simple modification of Cunningham’s [2,3] algorithm computes $\sigma$ and the 
smallest proper subset $Z \subset E$ such that 

\[
w(Z) - \sigma \cdot r(Z) = w(E) - \sigma \cdot r(E)
= \max_{X \subseteq E} \{ w(X) - \sigma \cdot r(X) \}. \tag{3}
\]

7.1 Computation of Principal Sequence Using the Subroutine to 
Compute the Strength 

Let $\mu_1$ denote the strength of $G(V, E, w)$. Let $H_1$ denote the largest 
subset of $E$ such that 

\[
w(E - H_1) - \mu_1 \cdot r(E - H_1) = w(E) - \mu_1 \cdot r(E)
= \max_{X \subseteq E} \{ w(X) - \mu_1 \cdot r(X) \}. \tag{4}
\]

Let $\mu_2$, $H_2$ be obtained by performing the above procedure on the graph 
$G(V, E) \cdot (E - H_1)$. 

Let $\mu_3$, $H_3$ be obtained by performing the above procedure on the graph 
$G(V, E) \cdot (E - (H_1 \cup H_2))$, 

and so on. 

In general, let $\mu_i$, $H_i$ be obtained by performing the above procedure on 
the graph $G(V, E) \cdot (E - (H_1 \cup H_2 \cup \ldots \cup H_{i-1}))$. 

We stop the above process when we find $\mu_t$ and $H_t$ such that 

\[
E = H_1 \cup H_2 \cup \ldots \cup H_t.
\]
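
The peeling procedure can be imitated on tiny graphs by brute force. The sketch
below replaces the strength subroutine with exhaustive enumeration, so it only
illustrates the control flow and is exponential in the number of edges. The
function names are hypothetical, weights are assumed to be integers (so that
`Fraction` comparisons are exact), the graph is assumed to have no self-loops,
and edges are kept in a `frozenset`, so parallel edges are not represented.

```python
from fractions import Fraction
from itertools import combinations


def rank(edges):
    """Rank of an edge set: number of edges in a spanning forest."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    r = 0
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            r += 1
    return r


def subsets(edges):
    items = list(edges)
    for k in range(len(items) + 1):
        for c in combinations(items, k):
            yield frozenset(c)


def strength(S, w):
    """Brute-force strength of the subgraph on edge set S."""
    best = None
    for X in subsets(S):
        gain = rank(S) - rank(S - X)
        if gain > 0:
            ratio = Fraction(sum(w[e] for e in X), gain)
            best = ratio if best is None else min(best, ratio)
    return best


def principal_sequence(E, w):
    """Return ([lambda_1..lambda_t], [E_1..E_t]) by repeated peeling."""
    S, mus, Hs = frozenset(E), [], []
    while S:
        mu = strength(S, w)
        g = lambda X: sum(w[e] for e in X) - mu * rank(X)
        top = g(S)               # the maximum of g, attained at S itself
        Z = S
        for X in subsets(S):     # smallest maximizer = intersection of all
            if g(X) == top:
                Z = Z & X
        mus.append(mu)
        Hs.append(S - Z)         # the part peeled off in this round
        S = Z
    return mus[::-1], Hs[::-1]   # re-index: lambda_i = mu_{t+1-i}


w = {(0, 1): 2, (1, 2): 2, (0, 2): 2, (2, 3): 1}
lams, parts = principal_sequence(w.keys(), w)
print(lams, parts)
```

On this example (a triangle with a pendant edge) the first round peels off the
pendant edge at strength 1 and the second round peels off the triangle at
strength 3.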



Noting that the sequence of ranks of successive subgraphs is strictly decreasing, 
one sees that 

Lemma 9. $t \leq r(E)$. 

7.2 Proof of Correctness of the Above Algorithm for Computation 
of the Principal Sequence 

The discussion in this section will establish that the above algorithm has indeed 
built the Principal Sequence of the rank function of an edge-weighted graph. 

Let $\mu_1, \mu_2, \ldots, \mu_t$ and $H_1, H_2, \ldots, H_t$ be as defined 
above, for an edge-weighted graph $G(V, E, w)$. Let us define 
$\lambda_i = \mu_{t+1-i}$, and $E_i = H_{t+1-i}$. We also follow the convention 
that $\lambda_0 = \infty$ and $E_0 = \emptyset$. 

We make use of the following technical lemma (see [18] for proof). 

Lemma 10. Let $\sigma$ be the strength of $G(V, E, w)$. Let $Z$ be the smallest 
(proper) subset of $E$ such that 

\[
w(Z) - \sigma \cdot r(Z) = w(E) - \sigma \cdot r(E)
= \max_{X \subseteq E} \{ w(X) - \sigma \cdot r(X) \}; \tag{5}
\]

then, 

1. $G(V, E) \times (E - Z)$ is molecular w.r.t. $(w, \sigma)$, and therefore has 
strength equal to $\sigma$. 

2. If $Z \neq \emptyset$ then $G(V, E) \cdot Z$ has strength strictly greater 
than $\sigma$. 

We will use the following well-known characterization of the Principal
Sequence (see [16], p. 404).

Theorem 6. [13,16,19] $Z_0 \subset Z_1 \subset \ldots \subset Z_l$, with $Z_0 = \emptyset$ and $Z_l = E$, is the
Principal Sequence of $G(V, E, w)$ if and only if there exist $\gamma_1 > \gamma_2 > \ldots > \gamma_l$
such that for each $i = 1, 2, \ldots, l$, $(G \cdot Z_i) \times (Z_i - Z_{i-1})$ is molecular w.r.t. $(w, \gamma_i)$.
Furthermore, the Principal Sequence exists and is unique.

Thus, using Theorem 6 and Lemmas 2, 3, and 4, it is clear that:

Theorem 7. The algorithm to preprocess the graph has decomposed the edge set
$E$ into $E_1, E_2, \ldots, E_t$ such that $G_i$ (as defined before) is molecular w.r.t. $(w, \lambda_i)$
for $i = 1, 2, \ldots, t$. The nested sequence of sets

$$\emptyset \subset E_1 \subset (E_1 \cup E_2) \subset \ldots \subset (E_1 \cup E_2 \cup \ldots \cup E_t) \; (= E)$$

is the Principal Sequence of the graph $G(V, E, w)$.

Using Lemma 9 and the above theorem we obtain Theorem 2.

Abstract. We investigate a variant of on-line edge-coloring in which
there is a fixed number of colors available and the aim is to color as many 
edges as possible. We prove upper and lower bounds on the performance 
of different classes of algorithms for the problem. Moreover, we determine 
the performance of two specific algorithms, First-Fit and Next-Fit. 



1 Introduction 

The Problem. In this paper we investigate the on-line problem Edge-Coloring 
defined in the following way. A number k of colors is given. The algorithm is 
given the edges of a graph one by one, each one specified by its endpoints. For 
each edge, the algorithm must either color the edge with one of the k colors, or 
reject it, before seeing the next edge. Once an edge has been colored the color 
cannot be altered and a rejected edge cannot be colored later. The aim is to 
color as many edges as possible under the constraint that no two adjacent edges 
receive the same color. 

Note that the problem investigated here is different from the classical version 
of the edge coloring problem, which is to color all edges with as few colors as 
possible. In [2] it is shown that, for the on-line version of the classical edge 
coloring problem, the greedy algorithm (the one that we call First-Fit) is optimal. 

The Measures. To measure the quality of the algorithms, we use the competitive 
ratio which was introduced in [6] and has become a standard measure for on- 
line algorithms. For the problem Edge-Coloring addressed in this paper, the 
competitive ratio of an algorithm A is the worst case ratio, over all possible input 
sequences, of the number of edges colored by A to the number of edges colored 
by an optimal off-line algorithm. 

In some cases it may be realistic to assume that the input graphs are all
k-colorable. Therefore, we also investigate the competitive ratio in the special
case where it is known that the input graphs are k-colorable. This idea is
similar to what was done in [1] and [3]. In these papers the competitive ratio is
investigated on input sequences that can be fully accommodated by an optimal
off-line algorithm with the resources available (in this paper the resource is, of
course, the colors). Such sequences are called accommodating sequences. This
is generalized in [4], where the competitive ratio as a function of the amount of
resources available is investigated.

This paper illustrates an advantage of analyzing accommodating sequences, 
apart from tailoring the measure to the type of input. A common technique when 
constructing a difficult proof is to start out investigating easier special cases. In 
our analysis of the general lower bound on the competitive ratio, the case of 
k-colorable input graphs was used as such a special case.

The Algorithms. We will mainly consider fair algorithms. A fair algorithm is 
an algorithm that never rejects an edge, unless it is not able to color it. Two 
natural fair algorithms are Next-Fit and First-Fit described in Sections 4 and 5 
respectively. 

The Graphs. The lower bounds on the competitive ratio proven in this paper 
are valid even if we allow multigraphs. The adversary graphs used for proving 
the upper bounds are all simple graphs. Thus, the upper bounds are valid even if 
we restrict ourselves to simple graphs. Furthermore, the adversary graphs are all 
bipartite except one which could easily be changed to a bipartite graph. Thus, 
the results are all valid for bipartite graphs too. 

The Proofs. Due to space limitations we have omitted the details of some of the 
proofs. The full version can be found in [5]. 

2 Notation and Terminology 

We label the colors 1, 2, ..., k and let $C_k = \{1, 2, \ldots, k\}$.

$K_{m,n}$ denotes the complete bipartite graph in which the two independent
sets contain m and n vertices respectively.

The terms fair_D, fair_R, on-line_D, and on-line_R denote arbitrary on-line
algorithms from the classes "fair deterministic", "fair randomized", "deterministic",
and "randomized", respectively, for the Edge-Coloring problem. The term
off-line denotes an optimal off-line algorithm for the problem.



3 The Competitive Ratio 

We begin this section with a formal definition of the competitive ratio for the 
problem Edge-Coloring. 

Definition 3.1. For any algorithm A and any sequence S of edges, let A(S) be
the number of edges colored by A and let OPT(S) be the number of edges colored
by an optimal off-line algorithm. Furthermore, let $0 \le c \le 1$.

An on-line algorithm A is c-competitive if there exists a constant b such that
$A(S) \ge c \cdot \mathrm{OPT}(S) - b$ for any sequence S of edges.

The competitive ratio of A is $C_A = \sup\{c \mid A \text{ is } c\text{-competitive}\}$.

3.1 A Tight Lower Bound for Fair Algorithms

In this section a tight lower bound on the competitive ratio for fair algorithms
is given. Note that it is not possible to give a general lower bound greater than
0, since the algorithm that simply rejects all edges has a competitive ratio of 0.

Theorem 3.2. For any fair on-line algorithm A for Edge-Coloring, $C_A \ge 2\sqrt{3} - 3 \approx 0.4641$.

Proof. Let $E_c$ denote the set of edges colored by A, let $E_u$ denote the set of
edges colored by off-line and not by A, and let $E_d$ denote the set of edges
colored by both off-line and A. Thus, $E_u \cup E_d$ are the edges colored by
off-line, and $E_d \subseteq E_c$. Similarly, for any vertex x, let $d_c(x)$, $d_u(x)$, and $d_d(x)$ denote
the number of edges incident to x that are in $E_c$, $E_u$, and $E_d$ respectively. Let c be a constant such that
$0 \le c \le 1$. Then A is c-competitive for any c such that $|E_c| \ge c(|E_d| + |E_u|)$,
or $|E_c| - c|E_d| \ge c|E_u|$.

Now, the intuition is that, for each edge $e \in E_c$, A earns one unit of some
value. If A can buy all edges in $E_u \cup E_d$ paying the fraction c of a unit for
each, then $|E_c| \ge c(|E_d| + |E_u|)$. A starts out buying all edges in $E_d$, paying
c for each. The remaining value is distributed to the edges in $E_u$ in two steps.
In the first step, each vertex x receives the value $m(x) = \frac{1}{2}(d_c(x) - c\, d_d(x))$.
Note that $\sum_{x \in V} m(x) = |E_c| - c|E_d|$. In the next step, the value on each vertex
is distributed equally among the edges in $E_u$ incident to it. Thus, each vertex
x with $d_u(x) \ge 1$ gives the value $m_u(x) = \frac{m(x)}{d_u(x)}$ to each edge in $E_u$ incident
to it. Note that

$$\sum_{(x,y) \in E_u} (m_u(x) + m_u(y)) \le \sum_{(x,y) \in E_u} (m_u(x) + m_u(y)) + \sum_{x \in V,\, d_u(x) = 0} m(x) = \sum_{x \in V} m(x) = |E_c| - c|E_d|.$$

Thus, if $m_u(x) + m_u(y) \ge c$ for any edge $(x, y) \in E_u$, then
$c|E_u| \le \sum_{(x,y) \in E_u} (m_u(x) + m_u(y)) \le |E_c| - c|E_d|$
and A is c-competitive.

The inequalities below follow from two simple facts. (1) For any vertex $x \in V$,
$d_d(x) + d_u(x) \le k$, since off-line can color at most k edges incident to x. (2) For
each edge $(x, y) \in E_u$, $d_c(x) + d_c(y) \ge k$, since A is a fair algorithm. For any
edge $(x, y) \in E_u$,

$$m_u(x) + m_u(y) = \frac{1}{2}\left(\frac{d_c(x) - c\, d_d(x)}{d_u(x)} + \frac{d_c(y) - c\, d_d(y)}{d_u(y)}\right)$$

$$\overset{(1)}{\ge} \frac{1}{2}\left(\frac{d_c(x) - c\, d_d(x)}{k - d_d(x)} + \frac{d_c(y) - c\, d_d(y)}{k - d_d(y)}\right)$$

$$\overset{(2)}{\ge} \frac{1}{2}\left(\frac{d_c(x) - c\, d_d(x)}{k - d_d(x)} + \frac{k - d_c(x) - c\, d_d(y)}{k - d_d(y)}\right)$$

Calculations show that this expression is greater than or equal to c as long as
$c \le 2\sqrt{3} - 3$. □
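The closing claim of the proof can be spot-checked numerically. The sketch below is not part of the proof: the function name `min_edge_value` and the brute-force search are assumptions made for illustration. It takes the bound obtained from fact (1), restricts the degrees by fact (2) together with $d_d \le d_c$ and $d_u \ge 1$, and searches all integer degree combinations for a given k.

```python
import math

def min_edge_value(k, c):
    """Smallest lower bound on m_u(x) + m_u(y) over all integer degree
    combinations permitted for an edge (x, y) in E_u."""
    def vertex_term(d_c):
        # Fact (1) gives d_u <= k - d_d, so m_u >= (d_c - c*d_d) / (2*(k - d_d)).
        # d_d ranges over 0..min(d_c, k - 1): d_d <= d_c by definition, and
        # d_d <= k - 1 because the vertex has an edge in E_u (d_u >= 1).
        return min((d_c - c * d_d) / (2 * (k - d_d))
                   for d_d in range(min(d_c, k - 1) + 1))
    terms = [vertex_term(d_c) for d_c in range(k + 1)]
    # Fact (2): d_c(x) + d_c(y) >= k for every edge (x, y) in E_u.
    return min(terms[a] + terms[b]
               for a in range(k + 1) for b in range(k - a, k + 1))

c_star = 2 * math.sqrt(3) - 3
worst = min(min_edge_value(k, c_star) for k in range(1, 41))
print(worst >= c_star)  # True
```

With a slightly larger constant the check fails for suitable k (for instance c = 0.47 and k = 25), which is consistent with the threshold $2\sqrt{3} - 3$ stated in the proof.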

In Section 4 it is shown that values of k exist for which the competitive ratio
of Next-Fit is arbitrarily close to $2\sqrt{3} - 3$. Thus, the result in Theorem 3.2 is tight.
The next theorem, in conjunction with Theorem 3.2, shows that all deterministic
fair algorithms must have very similar competitive ratios.

3.2 An Upper Bound for Fair Deterministic Algorithms

Theorem 3.3. No deterministic fair algorithm A for Edge-Coloring is more
than $\frac{1}{2}$-competitive.

Proof. We construct a simple graph $G = (V_1 \cup V_2, E)$ in two phases. In Phase
1, only vertices in $V_1$ are connected. In Phase 2, vertices in $V_2$ are connected to
vertices in $V_1$. Let $|V_1| = |V_2| = n$ for some large integer n.

In Phase 1, the adversary gives an edge between two unconnected vertices
$x, y \in V_1$ with a common unused color. Since the edge can be colored, fair_D will
do so. This process is repeated until no two unconnected vertices with a common
unused color can be found. At that point Phase 1 ends. For any vertex x, let
$C_u(x)$ denote the set of colors not represented at x. At the end of Phase 1, the
following holds true. For each color c and each vertex x such that $c \in C_u(x)$,
x is already connected to all other vertices y with $c \in C_u(y)$. Since x can be
connected to at most k other vertices, there are at most k vertices $y \ne x$ such
that $c \in C_u(y)$. Thus, $\sum_{x \in V_1} |C_u(x)| \le k(k + 1)$.

The edges given in Phase 2 are the edges of a k-regular bipartite graph with
$V_1$ and $V_2$ forming the two independent sets. Note that, by König's Theorem [7],
such a graph can be k-colored.

From Phase 2, fair_D gets at most $k(k + 1)$ edges, but off-line rejects all edges
from Phase 1 and accepts all edges from Phase 2, giving a performance ratio
of at most

$$\frac{\frac{1}{2}(nk - k(k+1)) + k(k+1)}{nk} = \frac{nk + k(k+1)}{2nk} = \frac{1}{2} + \frac{k+1}{2n}.$$

If we allow n to be arbitrarily large, this can be arbitrarily close to $\frac{1}{2}$. □



3.3 A General Upper Bound 

Now follows an upper bound on the competitive ratio for any type of algorithm 
for Edge-Coloring, fair or not fair, deterministic or randomized. 

Theorem 3.4. For any algorithm A for Edge-Coloring, $C_A \le \frac{4}{7}$.

Proof. In Fig. 1, the structure of the adversary graph is depicted. Each box
contains k vertices. When two boxes are connected, there are $k^2$ edges in a
complete bipartite graph between the 2k vertices inside the boxes. Note that
such a graph can be k-colored. The edges of the graph are divided into n levels,
level 1, ..., n. The adversary gives the edges, one level at a time, according to
the numbering of the levels. The edges of level i are given in three consecutive
phases:

1. $H_i$: Internal (horizontal) edges at level i. In total $k^2$ edges.

2. $V_i$: Internal (vertical) edges between level i and level i + 1. In total $2k^2$ edges.

3. $E_i$: External edges at level i. In total $2k^2$ edges.

Fig. 1. Structure of the adversary graph for the general upper bound on the
competitive ratio.

Let $X_{H_i}$ be a random variable counting how many edges on-line_R will color
from the set $H_i$, and let $X_{V_i}$ and $X_{E_i}$ count the colored edges from $V_i$ and $E_i$
respectively. For i = 0, ..., n, let $\mathrm{EXT}_i$ and $\mathrm{INT}_i$ be random variables counting
the sum of all external and internal edges, respectively, colored by on-line_R after
level i is given, i.e., $\mathrm{EXT}_i = \sum_{j=1}^{i} X_{E_j}$ and $\mathrm{INT}_i = \sum_{j=1}^{i} (X_{V_j} + X_{H_j})$. Note
that $\mathrm{EXT}_0 = \mathrm{INT}_0 = 0$.

If the adversary stops giving edges after Phase 1 of level i, off-line will color
$k^2(2i - 1)$ edges in total, namely the edges in the sets $E_1, E_2, \ldots, E_{i-1}$, and $H_i$. If
the adversary stops giving edges after Phase 2 (or 3) of level i, off-line will color
$2k^2 i$ edges, namely the edges in the sets $E_1, E_2, \ldots, E_{i-1}$, and $V_i$. The proof is
divided into two cases.

Case 1: There exists a level $i \le n$, where $E[\mathrm{EXT}_i] > \frac{2}{7}k^2 i$.

Let i denote the first level such that $E[\mathrm{EXT}_i] > \frac{2}{7}k^2 i$. Assume that the
number of edges colored by on-line_R is at least $\frac{4}{7}$ of the number of edges colored by
off-line. If the adversary stops the sequence after Phase 1 of level i, the following
inequality must hold:

(1) $E[\mathrm{INT}_{i-1}] + E[\mathrm{EXT}_{i-1}] + E[X_{H_i}] \ge \frac{4}{7}k^2(2i - 1)$.

If the adversary stops the sequence after Phase 2 of level i, the following
inequality must hold:

(2) $E[\mathrm{INT}_i] + E[\mathrm{EXT}_{i-1}] \ge \frac{8}{7}k^2 i$.

If on-line_R is $\frac{4}{7}$-competitive, both inequalities must hold. Adding inequalities (1)
and (2) yields

(3) $2(E[\mathrm{INT}_{i-1}] + E[\mathrm{EXT}_{i-1}]) + 2E[X_{H_i}] + E[X_{V_i}] \ge \frac{16}{7}k^2 i - \frac{4}{7}k^2$.

Now, $E[\mathrm{INT}_{i-1}] \le \frac{1}{2}(2k^2(i - 1) - E[\mathrm{EXT}_{i-1}] - E[X_{V_{i-1}}]) + E[X_{V_{i-1}}]$,
$E[\mathrm{EXT}_{i-1}] \le \frac{2}{7}k^2(i - 1)$, and $E[X_{V_{i-1}}] + 2E[X_{H_i}] + E[X_{V_i}] \le 2k^2 - E[X_{E_i}] < \frac{12}{7}k^2$.
Inserting these inequalities into (3) we arrive at a contradiction. Thus, in
this case on-line_R is not $\frac{4}{7}$-competitive.

Case 2: For all $i \le n$, $E[\mathrm{EXT}_i] \le \frac{2}{7}k^2 i$.

The expected number of edges colored by on-line_R is

$$E[\mathrm{INT}_n] + E[\mathrm{EXT}_n] \le \frac{1}{2}(2k^2 n - E[\mathrm{EXT}_n] - E[X_{V_n}]) + E[X_{V_n}] + E[\mathrm{EXT}_n]$$

$$= k^2 n + \frac{1}{2}(E[\mathrm{EXT}_{n-1}] + E[X_{E_n}] + E[X_{V_n}]) \le k^2 n + \frac{1}{2}\left(\frac{2}{7}k^2(n - 1) + 2k^2\right) = \frac{8}{7}k^2 n + \frac{6}{7}k^2.$$

Thus, we get an upper bound on the performance ratio of
$\frac{\frac{8}{7}k^2 n + \frac{6}{7}k^2}{2k^2 n} = \frac{4}{7} + \frac{3}{7n}$,
which can be arbitrarily close to $\frac{4}{7}$ if we allow n to be arbitrarily large. □

Thus, even if we allow probabilistic algorithms that are not necessarily fair, no 
algorithm is more than 0.11 apart from the worst fair algorithm when comparing 
competitive ratios. 

4 The Algorithm Next-Fit 

The algorithm Next-Fit (NF) is a fair algorithm that uses the colors in a cyclic
order. Next-Fit colors the first edge with the color 1 and keeps track of the
last used color $c_{last}$. When coloring an edge (u, v) it uses the first color in the
sequence $(c_{last} + 1, c_{last} + 2, \ldots, k, 1, 2, \ldots, c_{last})$ that is not yet used on any edge
incident to u or v, if any.
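The rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `next_fit`, the edge-list input format, and the use of `None` to mark a rejected edge are assumptions made for the example.

```python
from collections import defaultdict

def next_fit(edges, k):
    """Color edges on-line with Next-Fit; returns one color (1..k) or None per edge."""
    used = defaultdict(set)  # vertex -> colors already present on incident edges
    last = 0                 # no color used yet, so the first edge tries color 1
    colors = []
    for u, v in edges:
        chosen = None
        # Try c_last + 1, ..., k, 1, ..., c_last in cyclic order.
        for step in range(k):
            candidate = (last + step) % k + 1
            if candidate not in used[u] and candidate not in used[v]:
                chosen = candidate
                break
        if chosen is not None:
            used[u].add(chosen)
            used[v].add(chosen)
            last = chosen
        colors.append(chosen)  # None marks a rejected edge
    return colors
```

For example, `next_fit([("a", "b"), ("c", "d")], 2)` gives the two disjoint edges the colors 1 and 2, because the search for the second edge starts after the last used color.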

Intuitively, this is a poor strategy and it turns out that its competitive ratio 
matches the lower bound of section 3.1. Thus, this algorithm is mainly described 
here to show that the lower bound cannot be improved. 

When proving upper bounds for Next-Fit, it is useful to note that any coloring
in which each color is used on exactly n or n + 1 edges, for some $n \in \mathbb{N}$, can
be produced by Next-Fit, for some ordering of the request sequence. The colors
just need to be permuted so that the colors used on n + 1 edges are the lowest
numbered colors.

Theorem 4.1. $\inf_{k \in \mathbb{N}} C_{NF}(k) = 2\sqrt{3} - 3 \approx 0.4641$.

Proof. The adversary constructs a graph $G_{NF}$ in the following way. It chooses
an $x \in C_k$ as close to $(\sqrt{3} - 1)k$ as possible and then constructs a $(k - x)$-regular
bipartite graph $G_1 = (L_1 \cup R_1, E_1)$ with $|L_1| = |R_1| = k$ and a graph
$G_2 = (L_2 \cup R_2, E_2)$ isomorphic to $K_{x,x}$. Now, each vertex in $R_1$ is connected to
each vertex in $L_2$ and each vertex in $R_2$ is connected to each vertex in $L_1$. Call
these extra edges $E_{12}$. The graph $G_{NF}$ for k = 4 is depicted in Fig. 2.

Fig. 2. The graph $G_{NF}$ when k = 4, showing that $C_{NF}(4) \le \frac{13}{28} \approx 0.4643$.



112 Lene Monrad Favrholdt and Morten Nyhave Nielsen

Assume first that $k - x \le 1$. In this case $|L_1| \le |L_2| + 1$, so $G_1$ and $G_2$ can be
colored by Next-Fit with $C_{k-x}$ and $C_k \setminus C_{k-x}$ respectively. After this, Next-Fit
will not be able to color any of the edges in $E_{12}$. It is possible, however, to color
all edges in $E_1 \cup E_{12}$ with k colors, because the subgraph of $G_{NF}$ containing
these edges is bipartite and has maximum degree k. Thus, for any $x \in C_k$, the
competitive ratio of Next-Fit can be no more than

$$\frac{|E_1| + |E_2|}{|E_1| + |E_{12}|} = \frac{k(k - x) + x^2}{k(k - x) + 2kx} = \frac{k^2 - kx + x^2}{k^2 + kx}.$$

This ratio attains its minimum value of $2\sqrt{3} - 3$ when $x = (\sqrt{3} - 1)k$.
Thus, by allowing arbitrarily large values of k, it can be arbitrarily close to
$2\sqrt{3} - 3$.
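The bound can be evaluated directly for small k. The following sketch is only an illustration; the helper names `next_fit_bound` and `best_x` are hypothetical, and the edge counts are the ones used in the ratio above ($|E_1| = k(k - x)$, $|E_2| = x^2$, $|E_{12}| = 2kx$).

```python
from fractions import Fraction
import math

def next_fit_bound(k, x):
    """(|E_1| + |E_2|) / (|E_1| + |E_12|) for the adversary graph G_NF."""
    e1 = k * (k - x)   # (k - x)-regular bipartite graph on k + k vertices
    e2 = x * x         # K_{x,x}
    e12 = 2 * k * x    # all R_1-L_2 pairs and all R_2-L_1 pairs
    return Fraction(e1 + e2, e1 + e12)

def best_x(k):
    """Integer x in C_k = {1, ..., k} that minimises the bound."""
    return min(range(1, k + 1), key=lambda x: next_fit_bound(k, x))

print(best_x(4), next_fit_bound(4, 3))  # 3 13/28
```

For k = 4 this reproduces the value 13/28 quoted in the caption of Fig. 2, and for large k the minimum approaches $2\sqrt{3} - 3$ from above.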

If $k - x > 1$, then $|L_1| > |L_2| + 1$ and thus it is not possible to make Next-Fit
color all edges in $G_1$ using only $C_{k-x}$. In this case more copies of $G_{NF}$ are needed.
Let m be the smallest positive integer such that $m(k - x)$ is a multiple of k. Then
mx is a multiple of k as well. In general, m copies of $G_{NF}$, $G_{NF}^1, G_{NF}^2, \ldots, G_{NF}^m$, are
used. A k-coloring of the m copies of $G_{NF}$ in which each color is used the same
number of times can be obtained in the following way. In $G_{NF}^i$, $G_1$ is colored with
the colors $(k - x)(i - 1) + 1 \bmod k$, $(k - x)(i - 1) + 2 \bmod k$, ..., $(k - x)i \bmod k$,
and $G_2$ is colored with the remaining colors in $C_k$. □



5 The Algorithm First-Fit 

The algorithm First-Fit (FF) is a fair algorithm. For each edge e that it is able
to color, it colors e with the lowest numbered color possible.
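A minimal Python sketch of this rule follows. As with any such illustration, the function name `first_fit`, the edge-list input, and the `None` marker for rejected edges are assumptions, not notation from the paper.

```python
from collections import defaultdict

def first_fit(edges, k):
    """Color edges on-line with First-Fit; returns one color (1..k) or None per edge."""
    used = defaultdict(set)  # vertex -> colors already present on incident edges
    colors = []
    for u, v in edges:
        # Lowest-numbered color that is free at both endpoints, if there is one.
        free = [c for c in range(1, k + 1)
                if c not in used[u] and c not in used[v]]
        chosen = free[0] if free else None
        if chosen is not None:
            used[u].add(chosen)
            used[v].add(chosen)
        colors.append(chosen)  # None marks a rejected edge
    return colors
```

On two disjoint edges with k = 2, `first_fit` returns `[1, 1]`: unlike Next-Fit it always restarts the search at color 1.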

Theorem 5.1. $\inf_{k \in \mathbb{N}} C_{FF}(k) \le \frac{2}{9}(\sqrt{10} - 1) \approx 0.4805$.

Proof. The adversary graph $G_{FF}$ of this proof is inspired by the graph $G_{NF}$. It
is not possible, though, to make First-Fit color the subgraph $G_2$ of $G_{NF}$ with
$C_k \setminus C_{k-x}$. Therefore, the graph is extended by the subgraph $G'_2$ isomorphic to
$G_2$. Each vertex in $R_2$ is connected to exactly $k - x$ vertices in $L'_2$ and vice
versa. Now, $E_2$ denotes the edges in $G_2$ and $G'_2$ and the edges connecting them.
Finally, 2kx new vertices are added, and each vertex in $R_2 \cup L'_2$ is connected to
k of these vertices. Let $E_3$ denote the set of these extra edges. The graph $G_{FF}$
for k = 4 is depicted in Fig. 3.






Fig. 3. The graph $G_{FF}$ when k = 4, showing that $C_{FF}(4) \le \frac{25}{52} \approx 0.4808$.
If the edges in $G_1$ and the edges connecting $G_2$ and $G'_2$ are given first (one
perfect matching at a time), followed by the edges in $G_2$ and $G'_2$ (one perfect
matching at a time), it is obvious that First-Fit will color the edges in the
desired way. After this, First-Fit will not be able to color any more edges of
$G_{FF}$. On the other hand it is possible to k-color the set $E_1 \cup E_{12} \cup E_3$ of edges.
Thus, the competitive ratio of First-Fit can be no more than

$$\frac{|E_1| + |E_2|}{|E_1| + |E_{12}| + |E_3|} = \frac{k(k - x) + x^2 + kx}{k(k - x) + 2kx + 2kx} = \frac{k^2 + x^2}{k^2 + 3kx}.$$

This ratio attains its minimum value of $\frac{2}{9}(\sqrt{10} - 1)$ when $x = \frac{1}{3}(\sqrt{10} - 1)k$.
Thus, for the graph $G_{FF}$, the optimal (from an
adversary's point of view) value of x is an integer as close as possible to
$\frac{1}{3}(\sqrt{10} - 1)k$, and by allowing arbitrarily large values of k, the ratio can be arbitrarily
close to $\frac{2}{9}(\sqrt{10} - 1)$. □
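As a sanity check on the arithmetic, the sizes of the edge sets can be counted from the construction and the ratio evaluated exactly. The sketch below is an assumption-laden illustration: the name `first_fit_bound` is hypothetical, and the counts for $E_2$ and $E_3$ are read off from the description of $G_{FF}$ above rather than quoted from the paper.

```python
from fractions import Fraction
import math

def first_fit_bound(k, x):
    """(|E_1| + |E_2|) / (|E_1| + |E_12| + |E_3|) for the adversary graph G_FF."""
    e1 = k * (k - x)              # G_1 is (k - x)-regular on k + k vertices
    e2 = 2 * x * x + x * (k - x)  # G_2, G'_2, and the edges joining R_2 to L'_2
    e12 = 2 * k * x               # same count as in G_NF
    e3 = 2 * k * x                # one edge per new vertex
    return Fraction(e1 + e2, e1 + e12 + e3)

print(first_fit_bound(4, 3))  # 25/52
```

The value for k = 4 and x = 3 agrees with the caption of Fig. 3, and minimising over x for large k gives values close to $\frac{2}{9}(\sqrt{10} - 1)$.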



6 k-Colorable Graphs

Now that we know that the competitive ratio cannot vary much between different
kinds of algorithms for the Edge-Coloring problem, it would be interesting to
see what happens if we know something about the input graphs, for instance
that they are all k-colorable. In this section we investigate the competitive ratio
in the case where the input graphs are known to be k-colorable.



6.1 A Tight Lower Bound for Fair Algorithms 



Theorem 6.1. Any fair algorithm for Edge-Coloring is $\frac{1}{2}$-competitive on
k-colorable graphs.



Sketch of the Proof. As in the proof of Theorem 3.2 the idea is that each colored
edge is worth one unit of some value. The value of each colored edge e is
distributed equally among its endpoints and, from there, redistributed to the
uncolored edges adjacent to e. If each uncolored edge receives a total value of
at least one, then there are at least as many colored edges as uncolored edges.
Let $d_c$ and $d_u$ be defined as in the proof of Theorem 3.2. Then each uncolored
edge (x, y) receives the value

$$\frac{1}{2}\left(\frac{d_c(x)}{d_u(x)} + \frac{d_c(y)}{d_u(y)}\right) \ge \frac{1}{2}\left(\frac{d_c(x)}{k - d_c(x)} + \frac{d_c(y)}{k - d_c(y)}\right) \ge \frac{1}{2}\left(\frac{d_c(x)}{k - d_c(x)} + \frac{k - d_c(x)}{d_c(x)}\right) \ge 1.$$

The first inequality above follows from the fact that $d_c(x) + d_u(x) \le k$, since the
graph is k-colorable. The second inequality follows from the
fact that $d_c(x) + d_c(y) \ge k$, since the algorithm is fair. The last inequality holds
because $z + 1/z \ge 2$ for every $z > 0$. □



In Section 6.3 it is shown that, on k-colorable graphs, the competitive ratio of
the algorithm Next-Fit is $\frac{1}{2}$ for all even k. Thus, the result in Theorem 6.1 is
tight.

6.2 An Upper Bound for Deterministic Algorithms

Theorem 6.2. For any deterministic algorithm A for Edge-Coloring, $C_A \le \frac{2}{3}$,
even on k-colorable graphs.

Sketch of the Proof. The edges are given in two phases. In Phase 1, a large $\frac{k}{2}$-regular
bipartite graph $G = (L \cup R, E)$ is given. After Phase 1, the vertex set L
is divided in subsets according to the colors represented at each vertex. Vertices
with the same color sets are put in the same subset. The same is done to R.
Now, look at one such subset S. Assume that it has size n and that the number
of colors represented at each vertex is d. In Phase 2, $\frac{n}{2}$ new vertices are added
and connected to the vertices in S, creating a bipartite graph B in which the
new vertices have degree k, and the vertices in S have degree $\frac{k}{2}$. Thus, looking
at the whole graph, the total vertex degree of the vertices in B is $\frac{3}{2}nk$. on-line_D
can color at most $k - d$ edges incident to each of the new vertices. Therefore,
looking at the subgraph colored by on-line_D, the total degree of the vertices in
B is at most $2 \cdot \frac{n}{2}(k - d) + nd = nk$. Since each of the vertices given in Phase 2
is connected to only one of the sets L and R, the whole graph is bipartite. Thus,
off-line colors all of the edges. □



6.3 The Algorithm Next-Fit 



Theorem 6.3. On k-colorable graphs, $C_{NF}(k) \le \frac{1}{2}$ if k is even, and
$C_{NF}(k) \le \frac{1}{2} + \frac{1}{2k^2}$ if k is odd.

Proof. The adversary constructs a graph $G_{NF}$ in the following way. First it
constructs two complete bipartite graphs $G_1 = (L_1 \cup R_1, E_1)$ with $|L_1| = |R_1| = \lceil k/2 \rceil$
and $G_2 = (L_2 \cup R_2, E_2)$ with $|L_2| = |R_2| = \lfloor k/2 \rfloor$. $G_1$ can be colored with $\lceil k/2 \rceil$
colors using each color $\lceil k/2 \rceil$ times, and $G_2$ can be colored with $\lfloor k/2 \rfloor$ colors using
each color $\lfloor k/2 \rfloor$ times. The edges in these two graphs are given in an order such
that Next-Fit colors $G_1$ with $C_{\lceil k/2 \rceil}$ and $G_2$ with $C_k \setminus C_{\lceil k/2 \rceil}$. Now, each vertex in
$R_1$ is connected to each vertex in $L_2$ and each vertex in $R_2$ is connected to each
vertex in $L_1$. Let $E_{12}$ denote these edges connecting $G_1$ and $G_2$. Next-Fit is not
able to color any of the edges in $E_{12}$. It is, however, possible to color all edges in
$G_{NF}$ with $C_k$, since the graph is bipartite and has maximum degree k. Thus, in
the case where the input graphs are all k-colorable, the competitive ratio of
Next-Fit can be no more than

$$\frac{|E_1| + |E_2|}{|E_1| + |E_2| + |E_{12}|} = \frac{\lceil k/2 \rceil^2 + \lfloor k/2 \rfloor^2}{\lceil k/2 \rceil^2 + \lfloor k/2 \rfloor^2 + 2\lceil k/2 \rceil \lfloor k/2 \rfloor},$$

which reduces to $\frac{1}{2}$ when k is even, and to $\frac{1}{2} + \frac{1}{2k^2}$ when k is odd. □
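The two closed forms can be confirmed by evaluating the ratio of edge counts exactly. This is a small illustrative sketch; the function name `next_fit_k_colorable_bound` is an assumption.

```python
from fractions import Fraction

def next_fit_k_colorable_bound(k):
    """(|E_1| + |E_2|) / (|E_1| + |E_2| + |E_12|) for the graph in this proof."""
    hi, lo = (k + 1) // 2, k // 2  # ceil(k/2) and floor(k/2)
    e1 = hi * hi                   # complete bipartite graph G_1
    e2 = lo * lo                   # complete bipartite graph G_2
    e12 = 2 * hi * lo              # all R_1-L_2 pairs and all R_2-L_1 pairs
    return Fraction(e1 + e2, e1 + e2 + e12)

print(next_fit_k_colorable_bound(4), next_fit_k_colorable_bound(5))  # 1/2 13/25
```

Since $\lceil k/2 \rceil + \lfloor k/2 \rfloor = k$, the denominator is always $k^2$, which is why the odd case differs from $\frac{1}{2}$ by exactly $\frac{1}{2k^2}$.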



Fig. 4. The graph $G_{NF}$ when k = 5.

6.4 The Algorithm First-Fit

The following theorem is an immediate consequence of Lemma 6.5 and Lemma 6.6.

Theorem 6.4. On k-colorable graphs, $C_{FF}(k) = \frac{k}{2k-1}$.

Thus, for small values of k, the competitive ratio of First-Fit on k-colorable
graphs is significantly larger than that of Next-Fit, but the difference tends to
zero as k approaches infinity.

Lemma 6.5. On k-colorable graphs, $C_{FF}(k) \ge \frac{k}{2k-1}$.

Proof. Let E be the edge set of an arbitrary k-colorable graph G. Assume that
First-Fit is given the edges in E in some order. For $c \in C_k$, let $E_c$ denote the set
of edges that First-Fit colors with the color c. We will prove by induction on c
that, for all $c \in C_k$, $\sum_{i=1}^{c} |E_i| \ge \frac{c}{2k-1}|E|$.

For the base case, consider c = 1. By the definition of First-Fit, each edge
in $E \setminus E_1$ is adjacent to at least one edge in $E_1$. Furthermore, since G is
k-colorable, each edge in $E_1$ is adjacent to at most $2(k - 1)$ other edges. Thus,
$|E| \le 2(k - 1)|E_1| + |E_1|$, or $|E_1| \ge \frac{1}{2k-1}|E|$.

For the induction step, let $c \in C_k$. Each edge in $E \setminus \bigcup_{i=1}^{c} E_i$ is adjacent to at
least one edge in $E_c$. Moreover, since each edge in $E_c$ is adjacent to at least $c - 1$
edges in $\bigcup_{i=1}^{c-1} E_i$, each edge in $E_c$ is adjacent to at most $2(k - 1) - (c - 1) = 2k - c - 1$
edges in $E \setminus \bigcup_{i=1}^{c} E_i$. Therefore, $|E_c| \ge \frac{1}{2k-c}\left|E \setminus \bigcup_{i=1}^{c-1} E_i\right|$. Thus,

$$\sum_{i=1}^{c} |E_i| \ge \sum_{i=1}^{c-1} |E_i| + \frac{|E| - \sum_{i=1}^{c-1} |E_i|}{2k - c} = \frac{|E| + (2k - c - 1)\sum_{i=1}^{c-1} |E_i|}{2k - c} \ge \frac{|E| + (2k - c - 1)\frac{c-1}{2k-1}|E|}{2k - c} = \frac{c}{2k-1}|E|.$$ □
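The lemma can also be exercised empirically. The sketch below is an illustration under stated assumptions, not the authors' experiment: the helper names are hypothetical, and the test inputs are unions of k random perfect matchings between two sides of n vertices, which are bipartite with maximum degree k and hence k-colorable (parallel edges may occur, which the counting argument tolerates).

```python
import random
from collections import defaultdict
from fractions import Fraction

def first_fit_count(edges, k):
    """Number of edges First-Fit colors when the edges arrive in the given order."""
    used = defaultdict(set)  # vertex -> colors already present on incident edges
    colored = 0
    for u, v in edges:
        for c in range(1, k + 1):
            if c not in used[u] and c not in used[v]:
                used[u].add(c)
                used[v].add(c)
                colored += 1
                break
    return colored

def random_k_colorable_edges(n, k, rng):
    """Union of k random perfect matchings between sides L and R, in random order."""
    edges = []
    for _ in range(k):
        right = list(range(n))
        rng.shuffle(right)
        edges += [(("L", i), ("R", j)) for i, j in enumerate(right)]
    rng.shuffle(edges)
    return edges

rng = random.Random(0)
edges = random_k_colorable_edges(10, 5, rng)
fraction_colored = Fraction(first_fit_count(edges, 5), len(edges))
print(fraction_colored >= Fraction(5, 9))  # True, as Lemma 6.5 guarantees
```

Random inputs typically land well above the bound; reaching $\frac{k}{2k-1}$ exactly requires a carefully constructed graph and arrival order.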



Lemma 6.6. On k-colorable graphs, $C_{FF}(k) \le \frac{k}{2k-1}$.

Outline of the Proof. Inspired by the proof of Lemma 6.5, we construct a bipartite
graph G and a First-Fit coloring of G such that all vertices have degree k
and no edge is adjacent to more than one edge of each color. For such a graph
the analysis in the proof of Lemma 6.5 is tight, meaning that First-Fit colors
exactly $\frac{k}{2k-1}$ of the edges in G. Since G is bipartite, it can be k-colored off-line. □

7 Conclusions 

We have proven that the competitive ratios of algorithms for Edge-Coloring
can vary only between approximately 0.46 and 0.5 for deterministic algorithms
and between 0.46 and 0.57 for probabilistic algorithms (it can, of course, be
lower for algorithms that are not fair). Thus, we cannot hope for algorithms
with competitive ratios much better than those of Next-Fit and First-Fit. In the
case of k-colorable graphs the gap is somewhat larger: the (tight) lower bound
for fair algorithms is $\frac{1}{2}$ and the upper bound for deterministic algorithms is $\frac{2}{3}$.
In this case we have no upper bound on the competitive ratio for probabilistic
algorithms.

We have shown that the performance of Next-Fit matches the lower bound
on the competitive ratio in both the general case and in the special case of
k-colorable graphs. Furthermore, we have found the exact competitive ratio of
First-Fit on k-colorable graphs. For small values of k it is significantly better
than that of Next-Fit, but for large values of k they can hardly be distinguished.
In the general case, First-Fit is at most 0.016 better than Next-Fit. We believe
that the competitive ratio of First-Fit is larger than that of Next-Fit but we
have not proven it.

Abstract. We investigate polynomial-time approximability of the problems
related to edge dominating sets of graphs. When edges are unit-weighted,
the edge dominating set problem is polynomially equivalent to
the minimum maximal matching problem, in either exact or approximate
computation, and the former problem was recently found to be approximable
within a factor of 2 even with arbitrary weights. It will be shown,
in contrast with this, that the minimum weight maximal matching problem
cannot be approximated within any polynomially computable factor
unless P=NP.

The connected edge dominating set problem and the connected vertex
cover problem also have the same approximability when edges/vertices
are unit-weighted, and the former problem is known to be approximable,
even with general edge weights, within a factor of 3.55. We will show
that, when general weights are allowed, 1) the connected edge dominating
set problem can be approximated within a factor of $3 + \epsilon$, and
2) the connected vertex cover problem is approximable within a factor
of $\ln n + 3$ but cannot be within $(1 - \epsilon) \ln n$ for any $\epsilon > 0$ unless
$NP \subset DTIME(n^{O(\log \log n)})$.

1 Introduction 

In this paper we investigate polynomial-time approximability of the problems 
related to edge dominating sets of graphs. For two pairs of problems consid- 
ered, it will be shown that, while both problems in each pair have the same 
approximability for the unweighted case, they have drastically different ones 
when optimized under general non-negative weights. 

In an undirected graph an edge dominates all the edges adjacent to it, and an
edge dominating set (eds) is a set of edges collectively dominating all the other
edges in a graph. The problem EDS is then that of finding a smallest eds or,
if edges are weighted, an eds of minimum total weight. Yannakakis and Gavril
showed that EDS is NP-complete even when graphs are planar or bipartite of
maximum degree 3 [24]. Horton and Kilakos extended this NP-completeness
result to planar bipartite graphs, line and total graphs, perfect claw-free graphs,
and planar cubic graphs [14]. A set of edges is called a matching (or independent)
if no two of them have a vertex in common, and a matching is maximal if
no other matching properly contains it. Notice that any maximal matching is
necessarily an eds, because an edge not in it must be adjacent to some edge in it.
For this reason a maximal matching is also called an independent edge dominating set, and
the problem IEDS asks for computing a minimum maximal matching in a given
graph. Certainly, a smallest maximal matching cannot be smaller than a smallest
eds. Interestingly, one can construct a maximal matching, from any eds, of no
larger size in polynomial time [13], implying that the size of a smallest eds equals
that of a smallest maximal matching in any graph. Thus, EDS and IEDS are
polynomially equivalent, in exact or approximate computation, when graphs are
unweighted. Based on this and the fact that no maximal matching can be
more than twice as large as another one, it has long been known that either
problem, without weights, can be approximated within a factor of 2, and even
with weights, EDS was very recently shown approximable within a factor of
2 [4,7]. We will present, in contrast with this, strong inapproximability results
for weighted IEDS.

We next consider EDS with connectivity requirement, called the connected
edge dominating set (CEDS) problem, where it is asked to compute a connected
eds (ceds) of minimum weight in a given connected graph. Since it is always
redundant to form a cycle in a ceds, the problem can be restated as that of going
after a minimum tree whose vertices "cover" all the edges in a graph, and thus,
it is also called tree cover. Although enforcing the independence property on EDS
solutions does not alter (increase) their sizes as stated above, the connectivity
condition certainly does (just consider a path of length 5). The vertex cover
(VC) problem is another basic NP-complete graph problem [16], in which a
minimum vertex set is sought in G s.t. every edge of G is incident to some
vertex in the set, and when a vertex cover is additionally required to induce a
connected subgraph in a given connected graph, the problem is called connected
vertex cover (CVC) and known to be as hard to approximate as VC is [8]. These
problems are closely related to EDS and CEDS in that an edge set F is an
eds for G iff V(F), the set of vertices touched by edges in F, is a vertex cover
for G, and similarly, a tree F is a ceds iff V(F) is a connected vertex cover.
Since one can easily obtain a cvc of size $|F| + 1$ from a ceds F (tree), and
conversely, a ceds of size $|C| - 1$ from a cvc C, these two problems have the same
approximability for the unweighted case. The unweighted version of CEDS or
CVC is also known to be approximable within a factor of 2 [22,2]. It is not known,
however, if CEDS and CVC can be somehow related even if general weights are
allowed, and the algorithm scheme of Arkin et al. for weighted CEDS gives
its approximation factor in the form of $r_{st} + r_{wvc}(1 + 1/k)$, for any constant
k, where $r_{st}$ ($r_{wvc}$) is the performance ratio of any polynomial time algorithm
for the Steiner tree (weighted vertex cover, resp.) problem [2]. By using the
currently best algorithms for Steiner tree with $r_{st} = 1 + \frac{\ln 3}{2} \approx 1.55$ [21] and
for weighted vertex cover with $r_{wvc} = 2 - \log\log n / \log n$ [3] in their scheme, the
bound for weighted CEDS is estimated at 3.55. After improving this bound to
$3 + \epsilon$, we will show that weighted CVC is as hard to approximate as weighted
set cover is, indicating that it is not approximable within a factor better than
$(1 - \epsilon) \ln n$ unless $NP \subset DTIME(n^{O(\log \log n)})$ [6]. Lastly, we present an algorithm
approximating weighted CVC within a factor of $r_{wvc} + H(\Delta - 1) \le \ln(\Delta - 1) + 3$,
where H(k) is the kth Harmonic number and $\Delta$ is the maximal vertex degree of
a graph.

Since EDS is exactly the (vertex) dominating set problem on line graphs, it is 
worth comparing our results with those for independent/connected dominating 
set problems. The connected dominating set is as hard to approximate as set 
cover is, but can be approximated within a factor of $\ln n + 3$ for the unweighted
case [9], and within $1.35 \ln n$ for the weighted case [10]. The independent
dominating set problem (also called minimum maximal independent set), on the other
hand, cannot be approximated, even for the unweighted case, within a factor of
$n^{1-\epsilon}$ for any constant $\epsilon > 0$, unless P=NP [12].

2 Independent Edge Dominating Set 

To show that it is extremely hard to approximate IEDS, let us first describe a
general construction of a graph $G_\phi$ for a given 3SAT instance (i.e., a CNF formula)
$\phi$, by adapting the one used in reducing SAT to minimum maximal independent
set [15,12] to our case. For simplicity, every clause of $\phi$ is assumed w.l.o.g. to
contain exactly three literals. Each variable $x_i$ appearing in $\phi$ is represented in
$G_\phi$ by two edges adjacent to each other, and the endvertices of such a path
of length 2 are labeled $x_i$ and $\bar{x}_i$; let $E_v$ denote the set of these edges. Each
clause $c_j$ of $\phi$ is represented by a triangle (a cycle of length 3) $C_j$, and the vertices
of $C_j$ are labeled distinctively by the literals appearing in $c_j$; let $E_c$ denote the set
of edges in these disjoint triangles. The paths in $E_v$ and the triangles in $E_c$ are
connected together by having an edge between every vertex of each triangle and
the endvertex of a path having the same label. The set of these edges lying
between $E_v$ and $E_c$ is denoted by $E_b$. It is a simple matter to verify that, for
a 3SAT instance $\phi$ with $m$ variables and $p$ clauses, $G_\phi$ constructed this way
consists of $3(m + p)$ vertices and $2(m + 3p)$ edges.
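
The construction can be sketched in Python as follows. This is only an illustration under stated assumptions, not code from the paper: the function name `build_graph`, the encoding of literals as nonzero integers (`i` for $x_i$, `-i` for its negation) and the integer vertex numbering are all choices made for this example. The three edge classes are returned separately so that the vertex and edge counts can be checked against $3(m + p)$ and $2(m + 3p)$.

```python
from itertools import count


def build_graph(num_vars, clauses):
    """Build the graph for a 3SAT instance (illustrative sketch).

    clauses: list of 3-tuples of nonzero ints; i stands for x_i, -i for its negation.
    Returns (number of vertices, E_v, E_c, E_b).
    """
    ids = count()                 # fresh vertex identifiers 0, 1, 2, ...
    end = {}                      # literal -> endvertex of its variable path
    E_v, E_c, E_b = [], [], []
    for i in range(1, num_vars + 1):
        pos, mid, neg = next(ids), next(ids), next(ids)
        end[i], end[-i] = pos, neg
        E_v += [(pos, mid), (mid, neg)]          # path of length 2
    for clause in clauses:
        tri = [next(ids) for _ in clause]        # one triangle vertex per literal
        E_c += [(tri[0], tri[1]), (tri[1], tri[2]), (tri[0], tri[2])]
        # connect each triangle vertex to the path endvertex with the same label
        E_b += [(t, end[lit]) for t, lit in zip(tri, clause)]
    return next(ids), E_v, E_c, E_b


n, E_v, E_c, E_b = build_graph(3, [(1, -2, 3), (-1, 2, 3)])
print(n, len(E_v) + len(E_c) + len(E_b))
```

For the instance above ($m = 3$, $p = 2$) the counts agree with the formulas: 15 vertices and 18 edges.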

Lemma 1. Let $M(G)$ denote a minimum maximal matching in $G$. For any
3SAT instance $\phi$ with $m$ variables and $p$ clauses, and for any number $t$, there
exists a graph $G_\phi$ on $3(m + p)$ vertices and $2(m + 3p)$ edges, and a weight
assignment $w : E \to \{1, t\}$ such that

$$w(M(G_\phi)) \begin{cases} \le m + p & \text{if } \phi \text{ is satisfiable} \\ > t & \text{otherwise.} \end{cases}$$

Proof. Let $w(e) = 1$ if $e \in E_v \cup E_c$ and $w(e) = t$ if $e \in E_b$. Suppose that $\phi$ is
satisfiable, and let $\tau$ be a particular truth assignment satisfying $\phi$. Construct a
matching $M_v$ in $E_v$ by choosing, for each $i$, the edge with its endvertex labeled
by $x_i$ if $\tau(x_i)$ is true and the one having an endvertex labeled by $\bar{x}_i$ if $\tau(x_i)$ is
false. Consider any triangle $C_j$ in $E_c$. Since $\tau$ satisfies $\phi$, at least one edge among
those in $E_b$ connecting $C_j$ and $E_v$ must be dominated by $M_v$. This means that
all the edges in $E_b$ between $C_j$ and $E_v$ can be dominated by $M_v$ plus one edge
on $C_j$. Let $M_c$ denote the set of such edges, one taken this way from
each $C_j$. Then, $M_v \cup M_c$ is clearly a maximal matching since it dominates all
the edges in $G_\phi$. Since all the edges in $M_v \cup M_c$ are of weight 1, its weight is
$|M_v \cup M_c| = m + p$. On the other hand, if $\phi$ is not satisfiable, there is no way
to dominate all the edges in $E_b$ only by a matching built inside $E_v \cup E_c$, and
hence, any maximal matching in $G_\phi$ must incur a cost of more than $t$. □

The computational hardness of approximating weighted IEDS easily follows
from this lemma:

Theorem 1. For any polynomial time computable function $\alpha(n)$, IEDS cannot
be approximated on graphs with $n$ vertices within a factor of $\alpha(n)$, unless P=NP.

Proof. Given a 3SAT instance $\phi$ with $m$ variables and $p$ clauses, construct a
graph $G_\phi$ and assign a weight $w(e) \in \{1, t\}$ to each edge $e \in E$, as in the proof
of Lemma 1. Since $G_\phi$ consists of $3(m + p)$ vertices and $(m + p)\alpha(3(m + p))$
is computable in time polynomial in the length of $m + 3p$, we can set $t =
(m + p)\alpha(3(m + p)) = (m + p)\alpha(n)$. If a polynomial time algorithm $A$ exists
approximating IEDS within a factor of $\alpha(n)$, then, when applied to $G_\phi$, $A$ will
output a number at most $(m + p)\alpha(n)$ if $\phi$ is satisfiable, and a number greater
than $t = (m + p)\alpha(n)$ if $\phi$ is not satisfiable. Hence, $A$ decides 3SAT in polynomial
time. □

It is additionally pointed out in this section that IEDS is complete for exp-APX,
the class of NP optimization problems polynomially approximable within
some exponential factors [1], by slightly modifying the construction used in
Lemma 1 and allowing zero weight on edges. Previously, a more general problem,
minimum weight maximal independent set, was shown to be exp-APX-complete [5].

Definition 1. Minimum Weighted Satisfiability (MinWSAT) is the problem
defined by

Instance: a CNF formula $\phi$ with nonnegative weights $w(x)$ on the variables
appearing in $\phi$.

Solution: truth assignment $\tau$, either satisfying $\phi$ or setting all the variables to
"true". The latter is called a trivial assignment.

Objective: minimize $w(\tau) = \sum_{\tau(x) = \text{true}} w(x)$.

Theorem 2. The weighted IEDS problem is complete for exp-APX.

Proof. MinWSAT is known to be exp-APX-complete [19], and we reduce it to
weighted IEDS. Given $\phi$ and weights $w$ on its variables, define $G_\phi$ as before and
edge weights $w'$ such that

$$w'(e) = \begin{cases} w(x_i), & \text{if } e \in E_v \text{ and its endvertex is labeled by } x_i \\ \sum_i w(x_i), & \text{if } e \in E_b \\ 0, & \text{otherwise.} \end{cases}$$







If a maximal matching in $G_\phi$ contains no edges in $E_b$, it must have an edge in
$E_v$ labeled either $x_i$ or $\bar{x}_i$ for each $i$. So, from a MinWSAT instance $(\phi, w)$ and
a maximal matching $M$ in $G_\phi$, a truth assignment $\tau$ for $\phi$ can be recovered in
such a way that

$$\tau = \begin{cases} \text{truth assignment corresponding to } M \cap E_v, & \text{if } M \cap E_b = \emptyset \\ \text{trivial assignment}, & \text{otherwise.} \end{cases}$$

It is then straightforward to verify that any algorithm for IEDS with performance
guarantee of $\alpha$ can be used to approximate MinWSAT within a factor at most
$\alpha$. □

3 Connected Edge Dominating Set 

We first consider a restricted version of CEDS; for a designated vertex $r$ called
root, an $r$-ceds is a ceds touching $r$, and the problem $r$-CEDS is to compute an
$r$-ceds of minimum weight. Given an undirected graph $G = (V, E)$ with edge
weights $w : E \to \mathbb{Q}_+$, let $\vec{G} = (V, \vec{E})$ denote its directed version obtained by
replacing each edge $\{u, v\}$ of $G$ by two directed ones, $(u, v)$ and $(v, u)$, each of
weight $w(\{u, v\})$. For the root $r$, a non-empty set $S \subseteq V - \{r\}$ is called dependent
if $S$ is not an independent set in $G$. Suppose $T \subseteq E$ is an $r$-ceds, and let $\vec{T}$
denote the directed counterpart obtained by choosing, for each pair of directed
edges, the one directed away from the root to a leaf. Clearly, $w(T) = w(\vec{T})$.
Moreover, let $\vec{T}$ be represented by its characteristic vector $x^T \in \{0, 1\}^{\vec{E}}$, and,
for any $x \in \mathbb{Q}^{\vec{E}}$ and $F \subseteq \vec{E}$, let $x(F) = \sum_{a \in F} x_a$. Then, $x^T$ satisfies the linear
inequality $x(\delta^-(S)) \ge 1$ for all dependent sets $S \subseteq V$, where $\delta^-(S) = \{(u, v) \in
\vec{E} \mid v \in S, u \notin S\}$, because, when an edge exists inside $S$, at least one arc of $\vec{T}$
must enter it. Thus, the following linear programming problem is a relaxation
of $r$-CEDS:



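$$
\begin{array}{lll}
Z_{ceds} = \min & \sum_{a \in \vec{E}} w(a)\, x_a & \\
\text{s.t.} & x(\delta^-(S)) \ge 1 & \forall \text{ dependent set } S \subseteq V - \{r\} \qquad (1) \\
 & 0 \le x_a \le 1 & \forall a \in \vec{E}
\end{array}
$$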

Lemma 2. For any feasible solution $x \in \mathbb{Q}^{\vec{E}}$ of (1), let $V_+(x) = \{u \in V \mid
x(\delta^-(\{u\})) \ge 1/2\}$. Then, $V_+(x) \cup \{r\}$ is a vertex cover for $G$.

Proof. Take any edge $e = \{u, v\} \in E$, and assume $r \notin e$. Then, $\{u, v\}$ is a
dependent set, and $x(\delta^-(\{u, v\})) \ge 1$. Since every arc entering $\{u, v\}$ enters
either $u$ or $v$, this implies that either $x(\delta^-(\{u\}))$ or $x(\delta^-(\{v\}))$
is at least 1/2. Thus, $\{u, v\} \cap V_+(x) \ne \emptyset$. □

From this lemma it is clear that any tree $T \subseteq E$ containing all the vertices
in $V_+(x) \cup \{r\}$ is an $r$-ceds for $G$, and, in searching for such $T$ of small weight,
it can be assumed w.l.o.g. that the edge weights satisfy the triangle inequality
since any edge between two vertices can be replaced, if necessary, by the shortest
path between them. Then, the problem of finding such a tree of minimum weight
is called the (metric) Steiner tree problem: Given $G = (V, E)$ with edge weight
$w : E \to \mathbb{Q}_+$ and a set $R \subseteq V$ of required vertices (or terminals), find a minimum
weight tree containing all the required vertices and any others (called Steiner
vertices). For this problem Rajagopalan and Vazirani considered the so-called
bidirected cut relaxation [20]:

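$$
\begin{array}{lll}
Z_{smt} = \min & \sum_{a \in \vec{E}} w(a)\, x_a & \\
\text{s.t.} & x(\delta^-(S)) \ge 1 & \forall \text{ valid set } S \subseteq V - \{r\} \qquad (2) \\
 & 0 \le x_a \le 1 & \forall a \in \vec{E}
\end{array}
$$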

where the root $r$ is any required vertex and a set $S \subseteq V - \{r\}$ is valid if it
contains a required vertex. Based on this relaxation, they designed a primal-dual
approximation algorithm for metric Steiner tree and showed that it computes
a Steiner tree of cost at most $(3/2 + \epsilon) Z_{smt}$ and that the integrality gap of (2)
is bounded by 3/2, when restricted to graphs in which Steiner vertices form
independent sets (called quasi-bipartite graphs). Our algorithm for $r$-CEDS is
now described as:

1. Compute an optimal solution $x$ for (1).

2. Let $V_+(x) = \{u \in V \mid x(\delta^-(\{u\})) \ge 1/2\}$.

3. Compute a Steiner tree $T$ with $R = V_+(x) \cup \{r\}$, the set of required
vertices, by the algorithm of Rajagopalan and Vazirani.

4. Output $T$.

It is clear that this algorithm computes an $r$-ceds for $G$, except for one special
case in which $R = \{r\}$ and so, $T = \emptyset$; but then, it is trivial to find an optimal
$r$-ceds since $G$ is a star centered at $r$. Not so clear from this description is
polynomiality of its time complexity, and more specifically, that of Step 1. It
can be polynomially implemented by applying the ellipsoid method to (1), if the
separation problem for the polytope $P_{ceds}$ corresponding to the feasible region
of (1) is solved in polynomial time [11]. So, let $y$ be a vector in $\mathbb{Q}^{\vec{E}}$. It is easily
tested if $0 \le y_a \le 1$ for all $a \in \vec{E}$. To test whether $y(\delta^-(S)) \ge 1$ for every
dependent set $S$, we consider $y$ as a capacity function on the arcs of $\vec{G}$. For
every arc $a$ not incident upon $r$, contract $a$ by merging its two endvertices into
a single vertex $v_a$, and determine an $(r, v_a)$-cut $C_a$ of minimum capacity by, say,
the Ford-Fulkerson algorithm. It is then rather straightforward to see that

$$\min\{y(C_a) \mid a \in \vec{E} - \delta(\{r\})\} = \min\{y(\delta^-(S)) \mid S \subseteq V - \{r\} \text{ is dependent}\}$$

where $\delta(\{r\})$ is the set of arcs incident to $r$. So, by calculating $|\vec{E} - \delta(\{r\})|$
minimum capacity $(r, v_a)$-cuts, we can find a dependent set $S$ of minimum cut
capacity $y(\delta^-(S))$. If $y(\delta^-(S)) \ge 1$, we thus conclude that $y \in P_{ceds}$, while, if
not, the inequality $x(\delta^-(S)) \ge 1$ is violated by $y$ and a separation hyperplane is
found.
Notice that our graph $G$ is quasi-bipartite when $V_+(x) \cup \{r\}$ is taken as the
set of required vertices since it is a vertex cover for $G$, and for the approximation
quality of solutions, we have¹

Theorem 3. The algorithm above computes an $r$-ceds of weight at most
$(3 + \epsilon) Z_{ceds}$.

Proof. Let $x \in \mathbb{Q}^{\vec{E}}$ be an optimal solution of (1), and $T$ be an $r$-ceds computed
by the algorithm. As mentioned above, it was shown that $w(T) \le (3/2 + \epsilon) Z_{smt}$
when graphs are quasi-bipartite [20]. So, it suffices to show that $2x$ is a feasible
solution of (2) with $R = V_+(x) \cup \{r\}$, for then $Z_{smt} \le 2 \sum_{a \in \vec{E}} w(a) x_a = 2 Z_{ceds}$,
and hence, $w(T) \le 2(3/2 + \epsilon) Z_{ceds}$. To this end, let $S \subseteq V - \{r\}$ be any valid set.
If $S$ is not an independent set in $G$, it is dependent, ensuring that $x(\delta^-(S)) \ge 1$.
Suppose now $S$ is an independent set. Since it is a valid set, $S$ contains a vertex
$u$ ($\ne r$) in $V_+(x)$. But then, $x(\delta^-(\{u\})) \ge 1/2$, and, since $S$ is an independent
set in $G$, $2x(\delta^-(S)) \ge 2x(\delta^-(\{u\})) \ge 1$. Thus, in either case, $2x$ satisfies all the
linear constraints of (2). □

Since the integrality gap of (2) is bounded by 3/2 for quasi-bipartite graphs, we 
have 

Corollary 1. The integrality gap of (1) is bounded by 3. 

Lastly, since any ceds is an $r$-ceds for some $r \in V$, by applying the algorithm
with $r = u$ for each $u \in V$ and taking the best one among all computed, CEDS
can be approximated within a factor of $3 + \epsilon$.



4 Connected Vertex Cover 

Savage showed that the non-leaf vertices of any depth first search tree form a vertex
cover of size at most twice the smallest size [22]. Since such a vertex cover clearly
induces a connected subgraph, it actually means that a cvc of size no more than
twice that of the smallest vertex cover always exists and can be efficiently
computed. When vertices are arbitrarily weighted, however, the weighted set
cover problem can be reduced to it in an approximation preserving manner, as
was done for node-weighted Steiner trees [18] and connected dominating sets [9]:



Theorem 4. The weighted set cover problem can be approximated within the 
same factor as the one within which weighted CVC can be on bipartite graphs. 

Proof. From a set cover instance $(U, \mathcal{F})$ and $w : \mathcal{F} \to \mathbb{Q}_+$, where $\mathcal{F} \subseteq 2^U$ and
$\bigcup_{S \in \mathcal{F}} S = U$, construct a bipartite graph $G$ as a CVC instance, using a new
vertex $c$, with vertex set $(U \cup \{c\}) \cup \mathcal{F}$ s.t. an edge exists between $c$ and every
$S \in \mathcal{F}$, and between $u \in U$ and $S \in \mathcal{F}$ iff $u \in S$. All the vertices in $U$ and $c$ are
assigned zero weights, while every vertex $S \in \mathcal{F}$ inherits $w(S)$, the weight
of set $S$, from $(U, \mathcal{F})$.

¹ Independently of our work, Koenemann et al. recently obtained the same
performance guarantee by essentially the same algorithm [17].

For a vertex subset $V'$ of $G$ let $\Gamma(V')$ denote the set of vertices adjacent to
a vertex in $V'$. Clearly, $\mathcal{F}' \subseteq \mathcal{F}$ is a set cover for $(U, \mathcal{F})$ iff $U \subseteq \Gamma(\mathcal{F}')$ in $G$,
and moreover, for any set cover $\mathcal{F}'$, $\mathcal{F}' \cup U \cup \{c\}$ is a cvc of the same weight.
On the other hand, for any cvc $C$ for $G$, $U \subseteq \Gamma(C \cap \mathcal{F})$, i.e., $C \cap \mathcal{F}$ is a set
cover, of the same weight because, if not and $u \notin \Gamma(C \cap \mathcal{F})$ for some $u \in U$,
$\Gamma(\{u\}) \cap C = \emptyset$, and hence, there is no way to properly cover an edge incident to
$u$ by $C$. Thus, since it costs nothing to include $c$ and vertices in $U$, any cvc for $G$
can be assumed to be in the form of $\mathcal{F}' \cup U \cup \{c\}$ s.t. $\mathcal{F}'$ is a set cover for $(U, \mathcal{F})$,
with its weight equal to that of $\mathcal{F}'$. Therefore, any algorithm approximating
CVC within a factor $r$ can be used to compute a set cover of weight at most $r$
times the optimal weight. □
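
The reduction in this proof can be illustrated with a short Python sketch. It is an assumption-laden example rather than the authors' code: the function names, the tagged-tuple vertex encoding (`("hub", "c")`, `("elem", u)`, `("set", name)`) and the small instance are all invented here. It builds the bipartite CVC instance and checks that a set cover $\mathcal{F}'$ yields the connected vertex cover $\mathcal{F}' \cup U \cup \{c\}$ of the same weight, while a non-cover does not.

```python
def set_cover_to_cvc(universe, sets, weights):
    """Build the CVC instance: zero weight on the hub c and on elements,
    w(S) on each set vertex S."""
    hub = ("hub", "c")
    vertex_weight = {hub: 0}
    vertex_weight.update({("elem", u): 0 for u in universe})
    edges = []
    for name, members in sets.items():
        s = ("set", name)
        vertex_weight[s] = weights[name]
        edges.append((hub, s))                            # c is adjacent to every set
        edges.extend((("elem", u), s) for u in members)   # u ~ S iff u in S
    return vertex_weight, edges


def cover_to_cvc(universe, cover):
    """The vertex set F' + U + {c} from the proof."""
    return {("hub", "c")} | {("elem", u) for u in universe} | {("set", s) for s in cover}


def is_connected_vertex_cover(C, edges):
    """Check that C covers every edge and induces a connected subgraph."""
    if any(a not in C and b not in C for a, b in edges):
        return False
    start = next(iter(C))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for a, b in edges:
            for x, y in ((a, b), (b, a)):
                if x == v and y in C and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return seen == set(C)


universe = {1, 2, 3}
sets = {"A": {1, 2}, "B": {2, 3}, "C": {3}}
weights = {"A": 4, "B": 5, "C": 1}
vw, edges = set_cover_to_cvc(universe, sets, weights)
cvc = cover_to_cvc(universe, ["A", "B"])
print(is_connected_vertex_cover(cvc, edges), sum(vw[v] for v in cvc))
```

In the failing case (for example the family consisting of A alone), the uncovered element 3 is left isolated in the induced subgraph, which is exactly the situation the proof rules out.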

Due to the non-approximability of set cover [6], it follows that 

Corollary 2. The weighted CVC cannot be approximated within a factor better than
$(1 - \epsilon) \ln n$ for any $\epsilon > 0$, unless $NP \subseteq DTIME(n^{O(\log\log n)})$.

One simple strategy for approximating weighted CVC, which turns out to
yield a nearly tight bound, is to compute first a vertex cover $C \subseteq V$ for $G =
(V, E)$, and then to augment it to become connected by an additional vertex set
$D \subseteq V - C$. While many good approximation algorithms are known for vertex
cover, we also need to find such $D$ of small weight. This problem is not exactly
the same as, but not far from, weighted set cover, because it can be seen as a
specialization of the submodular set cover problem [23], which in general can
be stated simply as $\min_{D \subseteq N}\{w(D) \mid f(D) = f(N)\}$, given $(N, f)$ where $N$ is a
finite set and $f : 2^N \to \mathbb{R}_+$ is a nondecreasing, submodular set function on $N$.
For our case, take $N = V - C$, and $f(D) = k(C) - k(C \cup D)$ defined on $V - C$,
where $k(F)$ denotes the number of connected components in the subgraph $G[F]$
induced by $F$. Then, using the fact that $V - C$ is an independent set in $G$, it can
be verified that $f$ thus defined is indeed nondecreasing and submodular. Also
notice that $G[C \cup D]$ is connected iff $f(D) = k(C) - 1 = f(V - C)$. This way, the
problem of computing minimum $D \subseteq V - C$ such that $G[C \cup D]$ is connected
is formulated exactly by the submodular set cover problem for $(V - C, f)$.

The greedy algorithm for submodular set cover, adapted to our case, is now 
described as: 

1. Initialize $D \leftarrow \emptyset$.

2. Repeat until $G[C \cup D]$ becomes connected:

3. Let $u$ be a vertex minimizing $w(u)/(f(D \cup \{u\}) - f(D))$ among
$u \in V - C$.

4. Set $D \leftarrow D \cup \{u\}$.

5. Output $D$.
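
The greedy steps above can be sketched in Python as follows. This is a minimal illustration under assumptions of this example, not the paper's implementation: the graph is given as an adjacency dictionary `adj`, the names `components` and `greedy_connect` are hypothetical, and components are recounted from scratch at every step, which is simple but not efficient. The marginal gain $f(D \cup \{u\}) - f(D)$ equals the drop in the number of connected components of $G[C \cup D]$ when $u$ is added.

```python
def components(vertices, adj):
    """Number of connected components of the subgraph induced by `vertices`."""
    seen, k = set(), 0
    for s in vertices:
        if s in seen:
            continue
        k += 1
        seen.add(s)
        stack = [s]
        while stack:
            v = stack.pop()
            for u in adj[v]:
                if u in vertices and u not in seen:
                    seen.add(u)
                    stack.append(u)
    return k


def greedy_connect(adj, weight, C):
    """Greedily pick D outside the vertex cover C so that G[C + D] is connected."""
    C, D = set(C), set()
    while components(C | D, adj) > 1:
        base = components(C | D, adj)
        best, best_ratio = None, None
        for u in set(adj) - C - D:
            gain = base - components(C | D | {u}, adj)   # f(D + u) - f(D)
            if gain > 0:
                ratio = weight[u] / gain
                if best is None or ratio < best_ratio:
                    best, best_ratio = u, ratio
        if best is None:
            raise ValueError("G is not connected; no augmenting set exists")
        D.add(best)
    return D


adj = {"a": {"x", "y"}, "b": {"x", "y"}, "x": {"a", "b"}, "y": {"a", "b"}}
weight = {"a": 1, "b": 1, "x": 5, "y": 1}
print(greedy_connect(adj, weight, {"a", "b"}))
```

In this small example both `x` and `y` would merge the two components of the cover `{a, b}`, and the greedy rule prefers the cheaper vertex `y`.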

It was shown by Wolsey that the performance of the greedy algorithm for sub- 
modular set cover generalizes the one for set cover: 
Theorem 5 ([23]). The greedy algorithm for submodular set cover computes a
solution of weight bounded by $H(\max_{j \in N} f(\{j\}))$ times the minimum weight.

Since $\max_{j \in N} f(\{j\}) \le \Delta - 1$ in our case, for a graph of maximal vertex degree
$\Delta$, the greedy heuristic works with an approximation factor bounded by
$H(\Delta - 1) \le 1 + \ln(\Delta - 1)$.

Theorem 6. The algorithm above computes a cvc of weight at most $r_{wvc} +
H(\Delta - 1) \le \ln(\Delta - 1) + 3$ times the minimum weight.

Proof. Let $C^*$ be an optimal cvc and $C \cup D$ be the one computed by the algorithm
above for $G$, where $C$ is a vertex cover of weight at most twice that of the
minimum vertex cover, and $D$ is the greedy submodular set cover for $(V - C, f)$.
Clearly, $w(C) \le 2w(C^*)$. Observe that $G[C^* \cup C]$ remains connected because
any superset of a cvc is still a cvc. But then, it means that $C^* - C \subseteq V - C$ is
a submodular set cover for $(V - C, f)$. We thus conclude that

$$w(C \cup D) \le 2w(C^*) + H(\Delta - 1)\, w(C^* - C) \le (2 + H(\Delta - 1))\, w(C^*).$$

□
Abstract. A pushdown system is a graph $G(P)$ of configurations of a
pushdown automaton $P$. The model checking problem for a logic $L$ is:
given a pushdown automaton $P$ and a formula $\alpha \in L$ decide if $\alpha$ holds in
the vertex of $G(P)$ which is the initial configuration of $P$. Computation
Tree Logic (CTL) and its fragment EF are considered. The model checking
problems for CTL and EF are shown to be EXPTIME-complete and
PSPACE-complete, respectively.

1 Introduction 

A pushdown system is a graph $G(P)$ of configurations of a pushdown automaton
$P$. The edges in this graph correspond to single steps of computation of the
automaton. The pushdown model checking problem (PMC problem) for a logic
$L$ is: given a pushdown automaton $P$ and a formula $\alpha \in L$ decide if $\alpha$ holds
in the vertex of $G(P)$ which is the initial configuration of $P$. This problem is
a strict generalization of a more standard model checking problem where only
finite graphs are considered.

In this paper we consider the PMC problem for two logics: CTL and EF. CTL is
the standard Computation Tree Logic [4, 5]. EF is a fragment of CTL containing
only the operators: exists a successor ($\exists\circ\alpha$), and exists a reachable state ($\exists F\alpha$).
Moreover, EF is closed under conjunction and negation. We prove the following:

- The PMC problem for EF logic is PSPACE-complete.

- The PMC problem for CTL is EXPTIME-complete.

Research on the PMC problem has been going on for some time. The decidability
of this problem for monadic second order logic (MSOL) follows from [8] (for
a simpler argument see [2]). This implies decidability of the problem for all
those logics which have effective translations to MSOL. Among them are the
$\mu$-calculus, CTL* as well as the logics considered here. This general result however
gives only a nonelementary upper bound on the complexity of PMC. In [9]
EXPTIME-completeness of PMC for the $\mu$-calculus was proved. This result was
slightly encouraging because the complexity is not that much bigger than the
complexities of known algorithms for the model checking problem over finite
graphs. In [1] it was shown that the PMC problems for LTL and linear time
$\mu$-calculus are EXPTIME-complete.

The PMC problem for EF was considered already in [1]. It was shown there
that the problem is PSPACE-hard. Moreover it was argued that the general
method of the paper gives a PSPACE algorithm for the problem. Later, a closer
analysis showed that there is no obvious way of implementing the method in
polynomially bounded space [6]. The algorithm presented here follows the idea
used in [9] for the $\mu$-calculus.

The EXPTIME-hardness results for the alternation free $\mu$-calculus and LTL
show two different reasons for the hardness of the PMC problem. One is
unbounded alternation, the other is the ability to compare two consecutive blocks
of states on a path. The reachability problem for pushdown systems is of course
solvable in polynomial time (see [7] for a recent paper on this problem). Over
finite graphs the model checking problem for CTL reduces to a sequence of
reachability tests. This suggested that the PMC problem for CTL may be
PSPACE-complete. In this light the EXPTIME-hardness result is slightly surprising.
The argument combines ideas from the hardness results for the $\mu$-calculus and LTL. It
essentially shows that the $\exists G\alpha$ operator (there is a path on which $\alpha$ always holds)
is enough to obtain EXPTIME-hardness. This result makes EF logic more
interesting as it is a fragment of CTL that disallows $\exists G\alpha$ but still allows $\forall G\alpha$ (for
all paths $\alpha$ always holds).

The next section gives definitions concerning logics and pushdown systems.
Section 3 presents an assumption semantics of EF. This semantics allows us to
formulate the induction argument in the correctness proof of the model
checking algorithm. The proof is described in Section 4. The final section presents
the EXPTIME-hardness result of the PMC problem for CTL.

2 Preliminaries 

In this section we present CTL and EF logics. We define pushdown systems and
the model checking problem.

CTL and EF logics. Let $Prop$ be a set of propositional letters; let $p, p', \ldots$ range
over $Prop$.

The set of formulas of EF logic, $Form(EF)$, is given by the grammar: $\alpha ::=
p \mid \neg\alpha \mid \alpha \wedge \beta \mid \exists\circ\alpha \mid \exists F\alpha$. For CTL the grammar is extended with the clauses:
$\exists(\alpha_1 U \alpha_2) \mid \exists\neg(\alpha_1 U \alpha_2)$.

The models for the logic are labelled graphs $(V, E, \rho)$, where $V$ is the set of
vertices, $E$ is the edge relation and $\rho : V \to \mathcal{P}(Prop)$ is a labelling function
assigning to each vertex a set of propositional letters. Such labelled graphs are
called transition systems here. In this context vertices are also called states.

Let $M = (V, E, \rho)$ be a transition system. The meaning of a formula $\alpha$ in a
state $v$ is defined by induction. The clauses for propositional letters, negation
and conjunction are standard. For the other constructs we have:
- $M, v \models \exists\circ\alpha$ if there is a successor $v'$ of $v$ such that $M, v' \models \alpha$.

- $M, v \models \exists F\alpha$ if there is a path from $v$ to $v'$, s.t. $M, v' \models \alpha$.

- $M, v \models \exists(\alpha U \beta)$ if there is a path from $v$ to $v'$, s.t. $M, v' \models \beta$ and for all the
vertices $v''$ on the path other than $v'$ we have $M, v'' \models \alpha$.

- $M, v \models \exists\neg(\alpha U \beta)$ if there is a maximal (i.e., infinite or finite ending in a vertex
without successors) path $\pi$ from $v$ s.t. for every vertex $v'$ on $\pi$ with $M, v' \models \beta$
there is an earlier vertex $v''$ on $\pi$ with not $M, v'' \models \alpha$.

We will freely use abbreviations:

$$\alpha \vee \beta = \neg(\neg\alpha \wedge \neg\beta) \qquad \forall\circ\alpha = \neg\exists\circ\neg\alpha \qquad \forall G\alpha = \neg\exists F\neg\alpha$$

Using these one can convert every formula of EF logic to an equivalent positive 
formula where all the negations occur only before propositional letters. 
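
For intuition, the EF clauses and the abbreviation for $\forall G$ can be evaluated directly on a finite transition system. The following Python sketch is illustrative only and is not part of the paper: the tuple encoding of formulas (`"prop"`, `"not"`, `"and"`, `"EX"`, `"EF"`), the function names, and the representation of the labelling as a dictionary from vertices to sets of letters are assumptions of this example. It covers only finite graphs, whereas the paper's problem concerns the infinite graph of a pushdown system.

```python
def sat(formula, V, E, rho):
    """Return the set of vertices of the finite system (V, E, rho) satisfying formula."""
    op = formula[0]
    if op == "prop":
        return {v for v in V if formula[1] in rho[v]}
    if op == "not":
        return set(V) - sat(formula[1], V, E, rho)
    if op == "and":
        return sat(formula[1], V, E, rho) & sat(formula[2], V, E, rho)
    if op == "EX":                        # exists a successor
        target = sat(formula[1], V, E, rho)
        return {u for (u, v) in E if v in target}
    if op == "EF":                        # exists a reachable state
        reach = sat(formula[1], V, E, rho)
        frontier = True
        while frontier:                   # add predecessors until nothing changes
            frontier = {u for (u, v) in E if v in reach} - reach
            reach |= frontier
        return reach
    raise ValueError("unknown operator: %r" % (op,))


def AG(a):
    """The abbreviation: for all paths, a always holds."""
    return ("not", ("EF", ("not", a)))


V = {0, 1, 2, 3}
E = {(0, 1), (1, 2), (3, 3)}
rho = {0: set(), 1: set(), 2: {"p"}, 3: set()}
p = ("prop", "p")
print(sorted(sat(("EF", p), V, E, rho)), sorted(sat(AG(("not", p)), V, E, rho)))
```

Here the vertices 0, 1 and 2 can reach the vertex labelled `p`, while vertex 3 never does, so it is the only one satisfying the $\forall G \neg p$ abbreviation.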

Pushdown systems. A pushdown system is a tuple $P = (Q, \Gamma, \Delta, q_0, \bot)$ where $Q$
is a finite set of states, $\Gamma$ is a finite stack alphabet and $\Delta \subseteq (Q \times \Gamma) \times (Q \times \Gamma^*)$
is the set of transition rules. State $q_0 \in Q$ is the initial state and symbol $\bot \in \Gamma$
is the initial stack symbol.

We will use $q$, $z$, $w$ to range over $Q$, $\Gamma$ and $\Gamma^*$ respectively. We will write
$qz \to_\Delta q'w$ instead of $((q, z), (q', w)) \in \Delta$. We will omit the subscript $\Delta$ if it is clear
from the context.

In this paper we will restrict ourselves to pushdown systems with transition
rules of the form $qz \to q'$ and $qz \to q'z'z$. Operations pushing more elements
on the stack can be simulated with only a polynomial increase in the size of a
pushdown system. We will also assume that $\bot$ is never taken from the stack,
i.e., that there is no rule of the form $q\bot \to q'$ for some $q, q'$.

Let us now give the semantics of a pushdown system $P = (Q, \Gamma, \Delta, q_0, \bot)$.
A configuration of $P$ is a word $qw \in Q \times \Gamma^*$. The configuration $q_0\bot$ is the
initial configuration. A pushdown system $P$ defines an infinite graph $G(P)$ whose
nodes are configurations and whose edges are: $(qzw, q'w) \in E$ if $qz \to q'$, and
$(qzw, q'z'zw) \in E$ if $qz \to q'z'z$; for arbitrary $w \in \Gamma^*$.
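
The edge relation of $G(P)$ can be illustrated by a successor function on configurations. The sketch below is a hypothetical Python rendering, not taken from the paper: a configuration is a pair of a state and a tuple of stack symbols with the top first, and the two rule forms are stored in separate sets, `pop_rules` for $qz \to q'$ and `push_rules` for $qz \to q'z'z$. These names and encodings are assumptions of the example.

```python
def successors(config, pop_rules, push_rules):
    """One-step successors of a configuration in the configuration graph.

    config     : (q, stack), stack is a tuple with the top symbol first
    pop_rules  : set of (q, z, q2)      encoding  q z -> q2
    push_rules : set of (q, z, q2, z2)  encoding  q z -> q2 z2 z
    """
    q, stack = config
    if not stack:                         # empty stack: no outgoing edges
        return set()
    z, rest = stack[0], stack[1:]
    out = {(q2, rest) for (q1, z1, q2) in pop_rules if (q1, z1) == (q, z)}
    out |= {(q2, (z2, z) + rest)
            for (q1, z1, q2, z2) in push_rules if (q1, z1) == (q, z)}
    return out


pop_rules = {("q", "a", "r")}
push_rules = {("q", "a", "q", "b")}
print(successors(("q", ("a", "bot")), pop_rules, push_rules))
```

Only the top symbol and the state determine which rules apply; the rest of the stack is carried along unchanged, which is why the graph is infinite but finitely described.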

Given a valuation $\rho : Q \to \mathcal{P}(Prop)$ we can extend it to $Q \times \Gamma^*$ by putting
$\rho(qw) = \rho(q)$. This way a pushdown system $P$ and a finite valuation $\rho$ define a,
potentially infinite, transition system $M(P, \rho)$ whose graph is $G(P)$ and whose
valuation is given by $\rho$ as described above.

The model checking problem is:

given $P$, $\rho$ and $\varphi$ decide if $M(P, \rho), q_0\bot \models \varphi$

Please observe that the meaning of $\varphi$ in the initial configuration $q_0\bot$ depends
only on the part of $M(P, \rho)$ that is reachable from $q_0\bot$.

3 Assumption Semantics 

For this section let us fix a pushdown system $P$ and a valuation $\rho$. Let us
abbreviate $M(P, \rho)$ by $M$.

We are going to present a modification of the semantics of EF-logic. This
modified semantics is used as an induction assumption in the algorithm we are
going to present later. From the definition of a transition system $M$ it follows
that there are no edges from vertices $q$, i.e., configurations with the empty stack.
We will look at such vertices not as dead ends but as places where some parts
of the structure were cut out. We will take a function $S : Q \to \mathcal{P}(Form(EF))$
and interpret $S(q)$ as an assumption that in the vertex $q$ the formulas $S(q)$ hold.
This view leads to the following definition.

Definition 1. Let $S : Q \to \mathcal{P}(Form(EF))$ be a function. For a vertex $v$ of $G(P)$
and a formula $\alpha$ we define the relation $M, v \models_S \alpha$ as the least relation satisfying
the following conditions:

- $M, q \models_S \alpha$ for every $\alpha \in S(q)$.

- $M, v \models_S p$ if $p \in \rho(v)$.

- $M, v \models_S \alpha \wedge \beta$ if $M, v \models_S \alpha$ and $M, v \models_S \beta$.

- $M, v \models_S \alpha \vee \beta$ if $M, v \models_S \alpha$ or $M, v \models_S \beta$.

- $M, v \models_S \exists\circ\alpha$ for $v \notin Q$ if there is a successor $v'$ of $v$ such that $M, v' \models_S \alpha$.

- $M, v \models_S \forall\circ\alpha$ for $v \notin Q$ if for every successor $v'$ of $v$ we have that $M, v' \models_S \alpha$.

- $M, v \models_S \exists F\alpha$ if there is a path from $v$ to $v'$, s.t. $M, v' \models_S \alpha$ or $v' = q$ for
some state $q \in Q$ and $\exists F\alpha \in S(q)$.

- $M, v \models_S \forall G\alpha$ iff for every path $\pi$ from $v$ which is either infinite or finite
ending in a vertex from $Q$ we have that $M, v' \models_S \alpha$ for every vertex $v'$ of $\pi$
and moreover if $\pi$ is finite and ends in a vertex $q \in Q$ then $\forall G\alpha \in S(q)$.

Of course taking arbitrary S in the above semantics makes little sense. We 
need some consistency conditions as defined below. 

Definition 2. A set of formulas $B$ is saturated if

- for every formula $\alpha$ either $\alpha \in B$ or $\neg\alpha \in B$ but not both;

- if $\alpha \in B$ and $\beta \in B$ then $\alpha \wedge \beta \in B$;

- if $\alpha \in B$ then $\alpha \vee \beta \in B$ and $\beta \vee \alpha \in B$ for arbitrary $\beta$;

- if $\alpha \in B$ then $\exists F\alpha \in B$.

Definition 3 (Assumption function). A function $S : Q \to \mathcal{P}(Form(EF))$ is
saturated if $S(q)$ is saturated for every $q \in Q$. A function $S$ is consistent with
$\rho$ if $S(q) \cap Prop = \rho(q)$ for all $q \in Q$. We will not mention $\rho$ if it is clear from
the context. We say that $S$ is an assumption function (for $\rho$) if it is saturated
and consistent.

Lemma 1. For every assumption function $S$ and every vertex $v$ of $M$: $M, v \models_S
\alpha$ iff not $M, v \models_S \neg\alpha$.

The next lemma says that the truth of a depends only on assumptions about 
subformulas of a. 
Definition 4. For a formula $\alpha$, let $cl(\alpha)$ be the set of subformulas of $\alpha$ and
their negations.

Lemma 2. Let $\alpha$ be a formula. Let $S$, $S'$ be two assumption functions such that
$S(q) \cap cl(\alpha) = S'(q) \cap cl(\alpha)$ for all $q \in Q$. For every $v$ we have that: $M, v \models_S \alpha$
iff $M, v \models_{S'} \alpha$.

We have assumed that the initial stack symbol $\bot$ cannot be taken from the
stack. Hence no state $q$ is reachable from configuration $q_0\bot$. In this case our
semantics is equivalent to the usual one:

Lemma 3. For arbitrary $S$ and $\alpha$ we have $M, q_0\bot \models_S \alpha$ iff $M, q_0\bot \models \alpha$.

We finish this section with a composition lemma which is the main property
of our semantics. We will use it in induction arguments.

Definition 5. For a stack symbol $z$ and an assumption function $S$ we define
the function $S{\uparrow}z$ by: $S{\uparrow}z(q') = \{\beta : M, q'z \models_S \beta\}$, for all $q' \in Q$.

Lemma 4 (Composition Lemma). Let $\alpha$ be a formula, $z$ a stack symbol and
$S$ an assumption function. Then $S{\uparrow}z$ is an assumption function and for every
configuration $qwz'$ reachable from $qz'$ we have:

$$M, qwz'z \models_S \alpha \quad \text{iff} \quad M, qwz' \models_{S\uparrow z} \alpha$$

4 Model Checking EF 

As in the previous section let us fix a pushdown system $P$ and a valuation $\rho$.
Let us write $M$ instead of $M(P, \rho)$ for the transition system defined by $P$ and $\rho$.

Instead of the model checking problem we will solve a more general problem
of deciding if $M, qz \models_S \beta$ holds for given $q$, $z$, $S$ and $\beta$. A small difficulty here
is that $S$ is an infinite object. Fortunately, by Lemma 2, to decide if $M, qz \models_S \beta$
holds it is enough to work with $S$ restricted to subformulas of $\beta$, namely with
$S|_\beta$ defined by $S|_\beta(q') = S(q') \cap cl(\beta)$ for all $q' \in Q$. In this case we will also say
that $S$ is extending $S|_\beta$.

Definition 6. Let $\alpha$ be a formula, $q$ a state, $z$ a stack symbol, and $S : Q \to
\mathcal{P}(Form(EF))$ a function assigning to each state a subset of $cl(\alpha)$. We will say
that a tuple $(\alpha, q, z, S)$ is good if there is an assumption function $\bar{S}$ such that
$\bar{S}|_\alpha = S$ and $M, qz \models_{\bar{S}} \alpha$.

Below we describe a procedure which checks if a tuple $(\alpha, q, z, S)$ is good. It
uses an auxiliary procedure Search$(q, z, q')$ which checks whether there is a path
from the configuration $qz$ to the configuration $q'$.

- Check$(p, q, z, S) = 1$ if $p \in \rho(q)$;

- Check$(\alpha \wedge \beta, q, z, S) = 1$ if Check$(\alpha, q, z, S) = 1$ and Check$(\beta, q, z, S) = 1$;

- Check$(\neg\alpha, q, z, S) = 1$ if Check$(\alpha, q, z, S) = 0$;

- Check$(\exists\circ\alpha, q, z, S) = 1$ if either

  • there is $qz \to q'$ and $\alpha \in S(q')$; or

  • there is $qz \to q'z'z$ and Check$(\alpha, q', z', S') = 1$, where $S'$ is defined by:
    $S'(q'') = \{\beta \in cl(\alpha) : \text{Check}(\beta, q'', z, S) = 1\}$, for all $q'' \in Q$.

- Check$(\exists F\alpha, q, z, S) = 1$ if either

  • Check$(\alpha, q, z, S) = 1$; or

  • there is $qz \to q'$ and $\exists F\alpha \in S(q')$; or

  • there is $qz \to q'z'z$ and $q'' \in Q$ for which Search$(q', z', q'') = 1$ and
    Check$(\exists F\alpha, q'', z, S) = 1$; or

  • there is $qz \to q'z'z$ with Check$(\exists F\alpha, q', z', S') = 1$ for $S'$ defined by:
    $S'(q'') = \{\exists F\alpha : \text{Check}(\alpha, q'', z, S) = 1\} \cup \{\neg\exists F\alpha : \text{Check}(\alpha, q'', z, S) =
    0\} \cup \{\beta \in cl(\alpha) : \text{Check}(\beta, q'', z, S) = 1\}$, for all $q'' \in Q$.

- In other cases Check$(\alpha, q, z, S) = 0$.

- Search$(q_1, z, q_2) = 1$ if either

  • there is $q_1z \to q_2$; or

  • there is $q_1z \to q_1'z'z$ and $q_2' \in Q$ for which Search$(q_1', z', q_2') = 1$ and
    Search$(q_2', z, q_2) = 1$.



Lemma 5. We have Search$(q_1, z, q_2) = 1$ iff there is a path from the
configuration $q_1z$ to the configuration $q_2$. The procedure can be implemented on a Turing
machine working in $O(|Q|^2|\Gamma|)$ time and space.

Proof

The proof of the correctness of the procedure is easy. The procedure can be
implemented using dynamic programming. The implementation can construct a
table of all good values $(q_1, z, q_2)$. □
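
One way to picture the table construction is as a least fixpoint of the two Search rules. The Python sketch below is an illustration under assumptions of this example and not the implementation referred to in the proof: rules are encoded as `pop_rules` (triples for $q_1z \to q_2$) and `push_rules` (4-tuples for $q_1z \to q_1'z'z$), the function name `search_table` is hypothetical, and the naive iteration makes no attempt to meet the resource bound stated in the lemma.

```python
def search_table(states, pop_rules, push_rules):
    """Set of triples (q1, z, q2) such that q2 with the empty stack is reachable from q1 z.

    pop_rules  : set of (q1, z, q2)       encoding  q1 z -> q2
    push_rules : set of (q1, z, p1, z1)   encoding  q1 z -> p1 z1 z
    """
    good = set(pop_rules)                 # first rule: a single popping step
    changed = True
    while changed:                        # iterate the second rule to a fixpoint
        changed = False
        for (q1, z, p1, z1) in push_rules:
            for p2 in states:
                if (p1, z1, p2) not in good:      # need Search(p1, z1, p2) = 1
                    continue
                for q2 in states:
                    # need Search(p2, z, q2) = 1 to conclude Search(q1, z, q2) = 1
                    if (p2, z, q2) in good and (q1, z, q2) not in good:
                        good.add((q1, z, q2))
                        changed = True
    return good


states = {"q", "r", "s"}
pop_rules = {("q", "b", "r"), ("r", "a", "s")}
push_rules = {("q", "a", "q", "b")}       # q a -> q b a
print(sorted(search_table(states, pop_rules, push_rules)))
```

In this example the triple `("q", "a", "s")` is derived by pushing `b`, popping it to reach state `r` on top of `a`, and then popping `a`.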



Lemma 6. Procedure Check$(\alpha, q, z, S)$ can be implemented on a Turing
machine working in $Sp(|\alpha|) = O((|\alpha| \log(|Q|)|Q||\Gamma|)^2)$ space.

Proof

The proof is by induction on the size of $\alpha$. All the cases except for $\alpha = \exists F\beta$ are
straightforward.

For $\alpha = \exists F\beta$ consider the graph of exponential size whose nodes are of the
form Check$(\exists F\beta, q, z, S)$ for arbitrary $q$, $z$, $S$. The edges are given by the rules:

- Check$(\exists F\beta, q_1, z, S) \to$ Check$(\exists F\beta, q_2, z, S)$ whenever $q_1z \to q_1'z'z$ and
Search$(q_1', z', q_2) = 1$;

- Check$(\exists F\beta, q_1, z_1, S_1) \to$ Check$(\exists F\beta, q_2, z_2, S_2)$ if $q_1z_1 \to q_2z_2z_1$ and $S_2$
is defined by $S_2(q'') = \{\beta \in cl(\alpha) : \text{Check}(\beta, q'', z_1, S_1) = 1\} \cup \{\neg\exists F\alpha :
\text{Check}(\alpha, q'', z_1, S_1) = 0\} \cup \{\exists F\alpha : \text{Check}(\alpha, q'', z_1, S_1) = 1\}$.

Observe that by the induction assumption we can calculate whether there is an edge
between two nodes using space $Sp(|\beta|)$. A node Check$(\exists F\beta, q, z, S)$ is successful
if either Check$(\beta, q, z, S) = 1$ or there is $qz \to q'$ with $\exists F\beta \in S(q')$.

It is easy to see that Check$(\exists F\beta, q, z, S) = 1$ iff in the graph described above
there is a path from the node Check$(\exists F\beta, q, z, S)$ to a successful node.

We need $O(\log(|Q|)|\Gamma||Q||\beta|)$ space to store a node of the graph. So we
need $O((\log(|Q|)|\Gamma||Q||\beta|)^2)$ space to implement Savitch's algorithm performing
a deterministic reachability test in this graph. We also need $Sp(|\beta|)$ space for an
oracle to calculate edges and $O(|Q|^2|\Gamma|)$ space for the Search procedure. All this fits
into $Sp(|\exists F\beta|)$ space. □

Remark: It does not seem that this lemma follows from the fact that alter- 
nating machines with bounded alternation can be simulated by deterministic 
ones with small space overhead (c.f. the theorem attributed in [3] to a personal 
communication from A. Borodin). 

Lemma 7. A tuple $(\alpha, q, z, S)$ is good iff $\mathrm{Check}(\alpha, q, z, S) = 1$.

Proof 

The proof is by induction on the size of $\alpha$. The case when $\alpha$ is a propositional
letter is obvious. The case when $\alpha = \neg\beta$ follows from Lemma 1. The case for
conjunction is easy using Lemma 2. We omit the case for $\alpha = \exists\bigcirc\beta$ because the
argument is simpler than in the case of the $F$ operator.

Case $\alpha = \exists F\beta$. Suppose that $(\alpha, q, z, S)$ is good. This means that there is
an assumption function $\overline{S}$ such that $\overline{S}|_\alpha = S$ and $M, qz \models_{\overline{S}} \alpha$. By the definition
of the semantics, there is a vertex $v$ reachable from $qz$ such that $M, v \models_{\overline{S}} \beta$ or
$v = q'$ and $\exists F\beta \in S(q')$. Suppose that $v$ is such a vertex at the smallest distance
from $qz$. We show that $\mathrm{Check}(\alpha, q, z, S) = 1$ by induction on the distance to $v$.

If $v = qz$ then, as $\beta$ is a subformula of $\alpha$, we have by the main induction
hypothesis that $\mathrm{Check}(\beta, q, z, S) = 1$. So $\mathrm{Check}(\alpha, q, z, S) = 1$. If $qz \to q'$ and
$\exists F\beta \in S(q')$ then we also get $\mathrm{Check}(\alpha, q, z, S) = 1$. Otherwise we have $qz \to
q'z'z$ and $q'z'z$ is the first vertex on the shortest path to $v$.

Suppose that on the path to $v$ there is a configuration of the form $q''z$ for
some $q''$. Assume moreover that it is the first configuration of this form on the
path. We have that $\mathrm{Search}(q', z', q'') = 1$ and $M, q''z \models_{\overline{S}} \exists F\beta$. As the distance
to $v$ from $q''z$ is smaller than from $qz$, we get $\mathrm{Check}(\exists F\beta, q'', z, S) = 1$ by the
induction hypothesis. Hence $\mathrm{Check}(\alpha, q, z, S) = 1$.

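Otherwise, i.e., when there is no configuration of the form $q''z$ on the path
to $v$, we know that $v = q''wz'z$ for some $q'' \in Q$ and $w \in \Gamma^*$. Moreover we
know that $q''wz'$ is reachable from $q'z'$. By Composition Lemma we have that
$M, q''wz' \models_{\overline{S}\uparrow z} \beta$. Let $S_1$ be a function defined by $S_1(q_1) = (\overline{S}\uparrow z)|_\beta(q_1) \cup
\{\neg\exists F\beta : \beta \notin \overline{S}\uparrow z(q_1)\} \cup \{\exists F\beta : \beta \in \overline{S}\uparrow z(q_1)\}$. It can be checked that $S_1$ can be
extended to an assumption function $\overline{S}_1$. By Lemma 2 we have $M, q''wz' \models_{\overline{S}_1} \beta$.
Hence $M, q'z' \models_{\overline{S}_1} \exists F\beta$. We have $\mathrm{Check}(\exists F\beta, q', z', S_1) = 1$ from the induction
hypothesis. By definition of $S_1$ and the induction hypothesis we have that
$S_1(q_1) = \{\gamma \in cl(\beta) : \mathrm{Check}(\gamma, q_1, z, S) = 1\} \cup \{\neg\exists F\beta : \mathrm{Check}(\beta, q_1, z, S) =
0\} \cup \{\exists F\beta : \mathrm{Check}(\beta, q_1, z, S) = 1\}$. Which gives $\mathrm{Check}(\alpha, q, z, S) = 1$.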
For the final case suppose that $\alpha = \exists F\beta$ and that $\mathrm{Check}(\alpha, q, z, S) = 1$. We
want to show that $(\alpha, q, z, S)$ is good using additional induction on the length
of the computation of $\mathrm{Check}(\alpha, q, z, S)$. Let $\overline{S}$ be an assumption function such
that $\overline{S}|_\alpha = S$.

Skipping a couple of easy cases suppose that there is $qz \to q'z'z$ and that
we have $\mathrm{Check}(\exists F\beta, q', z', S') = 1$ for $S'$ defined by $S'(q'') = \{\gamma \in cl(\beta) :
\mathrm{Check}(\gamma, q'', z, S) = 1\} \cup \{\neg\exists F\beta\}$ or $S'(q'') = \{\gamma \in cl(\beta) : \mathrm{Check}(\gamma, q'', z, S) =
1\} \cup \{\exists F\beta\}$ depending on whether $\mathrm{Check}(\beta, q'', z, S) = 0$ or not. By the induction
hypothesis, $M, q'z' \models_{\overline{S'}} \exists F\beta$ for an assumption function $\overline{S'}$ such that $\overline{S'}|_\alpha = S'$.
Consider $\overline{S}\uparrow z$. We have that $(\overline{S}\uparrow z)|_\beta = \overline{S'}|_\beta$ by the induction hypothesis.
It is also the case that for every $q'' \in Q$, whenever $\exists F\beta \in \overline{S'}(q'')$ then $\exists F\beta \in
\overline{S}\uparrow z(q'')$. Hence, by Lemma 2 and the definition of our semantics, we have that
$M, q'z' \models_{\overline{S}\uparrow z} \exists F\beta$. By Composition Lemma we have $M, q'z'z \models_{\overline{S}} \exists F\beta$. Which
gives $M, qz \models_{\overline{S}} \exists F\beta$. So $(\alpha, q, z, S)$ is good. □



5 Model Checking CTL 

In this section we show that the model checking problem for pushdown systems
and CTL is EXPTIME-hard. The problem can be solved in EXPTIME as there
is a linear translation of CTL to the $\mu$-calculus and the model checking for the
latter logic can be done in EXPTIME [9].

Let $M$ be an alternating Turing machine using $n$ tape cells on input of size $n$.
For a given configuration $c$ we will construct a pushdown system $P_M^c$, valuation
$\rho_M$, and a CTL formula $\alpha_M$ such that: $M(P_M^c, \rho_M), q_0\bot \models \alpha_M$ iff $M$ has an
accepting computation from $c$. As $P_M^c$ and $\alpha_M$ will be polynomial in the size of
$c$ this will show EXPTIME-hardness of the model checking problem.

We will do the construction in two steps. First, we will code the acceptance 
problem into the reachability problem for a pushdown system extended with 
some test operations. Then, we will show how to simulate these tests in the 
model checking problem. 

We assume that the nondeterminism of M is limited so that from every 
configuration M has at most two possible moves. A move is a pair m = (a, d) 
where $a$ is a letter to put and $d$ is a direction for moving the head. We use $c \vdash_m c'$
to mean that $c'$ is obtained from $c$ by doing the move $m$. The transition function
of M assigns to each pair (state, letter) a pair of moves of M. A computation of 
M can be represented as a tree of configurations. If the machine is in a universal 
state then the configuration has two sons corresponding to the two moves in the 
pair given by the transition function. If the machine is in an existential state 
then there is only one son for one of the moves from the pair. 
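
The tree-shaped notion of computation described above can be sketched as a short recursive acceptance check. This Python sketch is an illustration under stated assumptions, not the paper's construction: configurations are opaque values, the callbacks `is_accepting`, `is_universal` and `successors` (returning the two configurations obtained by the pair of moves) are hypothetical, and the explicit `depth` bound is added here only to guarantee termination.

```python
def accepts(c, is_accepting, is_universal, successors, depth):
    """Does an accepting computation tree of height <= depth exist from c?"""
    if is_accepting(c):
        return True
    if depth == 0:
        return False
    left, right = successors(c)   # results of the two moves of the pair

    def sub(d):
        return accepts(d, is_accepting, is_universal, successors, depth - 1)

    if is_universal(c):
        # universal state: both sons must lead to acceptance
        return sub(left) and sub(right)
    # existential state: one son for one of the two moves suffices
    return sub(left) or sub(right)


# Toy instance: configurations are integers, c has sons 2c and 2c + 1,
# configuration 1 is universal, all others are existential.
sons = lambda c: (2 * c, 2 * c + 1)
universal = lambda c: c == 1
```

With accepting set {4, 6}, both sons of configuration 1 can reach an accepting configuration, so the check succeeds; with accepting set {4} alone it fails at the universal root.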

An extended pushdown system is obtained by adding two kinds of test transitions.
Formally each of the kinds of transitions depends on a parameter $n$ which is
a natural number. To make notation clearer we fix this number in advance.
Transition $q \to_A q'$ checks whether the first $n$ letters from the top of the stack form
an accepting configuration of $M$. Transition $q \to_M q'$ checks, roughly, whether
the first $2n$ letters from the top of the stack form two configurations such that
the first is the successor of the second. A formal definition of these transitions
is given below when we define a particular extended pushdown system.

Let us fix $n$ as the size of input to our Turing machine. We define an extended
pushdown system $EP_M$ simulating computations of $M$ on inputs of size $n$. The
set of states of the system is $Q = \{q, q_M, q_A\}$. The stack alphabet is $\Gamma = \Gamma_M \cup
Q_M \cup (\mathit{Moves}_M \times \mathit{Moves}_M) \cup \{E, L, R\}$; where $\Gamma_M$ is the tape alphabet of $M$; $Q_M$
is the set of states of $M$; $\mathit{Moves}_M$ is the set of moves of $M$; and $E$, $L$, $R$ are
new special letters which stand for arbitrary, left and right element of a pair
respectively. Before defining transitions of $EP_M$ let us formalize the definition
of $\to_A$ and $\to_M$ transitions. These transitions add the following edges in the
graph of configurations of the system:

- For a transition $q \to_A q'$ and for an arbitrary $w \in \Gamma^*$ we have the edge
$qcw \to q'cw$ if $c$ is an accepting configuration of $M$.

- For a transition $q \to_M q'$, for an arbitrary $w \in \Gamma^*$ and a letter $? \in \{E, L, R\}$
we have the edge $qc'?(m_1,m_2)cw \to q'c'?(m_1,m_2)cw$ if $(m_1, m_2)$ is the move
from the configuration $c$ and $c \vdash_m c'$ where $m = m_1$ if $? = L$; $m = m_2$ if $? = R$;
and $m \in \{m_1, m_2\}$ if $? = E$.

Finally, we present the transition rules of $EP_M$. Below, $a'$ stands for any
letter other than $E$, $L$ or $R$. We use $c$, $c'$ to stand for a configuration of $M$, i.e.,
a string of length $n + 1$.

$q \to q_A$                              $q_A c \to_A q$

$qa' \to q_M c' L(m_1,m_2) a'$           $q_M \to_M q$

$qa' \to q_M c' E(m) a'$                 $qL(m_1,m_2) \to q_M c' R(m_1,m_2)$

$qR(m_1,m_2)c \to q$                     $qE(m)c \to q$

It is easy to see that the transitions putting or taking a whole configuration
from the stack can be simulated by a sequence of simple transitions working
with one letter at a time. In the above, transition $q_A c \to_A q$ (which removes a
configuration and at the same time checks whether it is accepting) is not exactly
in the format we allow. Still it can be simulated by two transitions in our format.
We use $G(EP_M)$ to denote the graph of configurations of $EP_M$, i.e., the graph
whose vertices are configurations and whose edges correspond to one application
of the transition rules.

The idea behind the construction of $EP_M$ is described by the following
lemma.

Lemma 8. For every configuration $c$ of $M$ we have that: $M$ accepts from $c$ iff
in the graph $G(EP_M)$ of configurations of $EP_M$ configuration $q$ is reachable from
configuration $qc$.

Proof 

We present only a part of the argument for the left to right direction. The 
proof proceeds by induction on the height of the tree representing an accepting 
computation of M on c. 
If $c$ is an accepting configuration then we have a path $qc \to q_A c \to q$ in
$G(EP_M)$.

Suppose now that the first move of $M$ in its computation is $(m_1, m_2)$ and it
is an existential move. Then we have a path:

$qc \to q_M c' E(m_1,m_2)c \to qc' E(m_1,m_2)c \to \cdots \to qE(m_1,m_2)c \to q$

where the existence of a path $qc' E(m_1,m_2)c \to \cdots \to qE(m_1,m_2)c$ follows from
the induction hypothesis.

Suppose now that the first move of $M$ in an accepting computation from $c$
is $(m_1, m_2)$ and it is a universal move. We have a path:

$qc \to q_M c' L(m_1,m_2)c \to qc' L(m_1,m_2)c \to \cdots \to qL(m_1,m_2)c \to$

$q_M c'' R(m_1,m_2)c \to qc'' R(m_1,m_2)c \to \cdots \to qR(m_1,m_2)c \to q$.

Once again the existence of dotted out parts of the path follows from the induc- 
tion hypothesis. 

This completes the proof from the left to right direction. The opposite direc- 
tion is analogous. □ 

The next step in our proof is to code the above reachability problem into
the model checking problem for a normal pushdown system. First, we change
the extended pushdown system $EP_M$ into a normal pushdown system $P_M$. We add
new states $q_{TA}$, $q_{TM}$, $q_F$ and $q_a$ for every letter $a$ of the stack alphabet. The
role of $q_{TA}$ and $q_{TM}$ is to initiate the tests performed originally by $\to_A$ and $\to_M$
transitions, respectively. State $q_F$ is a terminal state signalling success. States
$q_a$ are used in the test. They take out all the letters from the stack and give
information about what letters are taken out. In the rules below $c$, $c'$ range over
configurations; $a$, $b$ over single letters; and $a'$ over letters other than $E$, $L$ or $R$.

$qa' \to q_A a'$                         $q_A \to q,\ q_{TA}$

$qa' \to q_M c' L(m_1,m_2) a'$           $q_M \to q,\ q_{TM}$

$qa' \to q_M c' E(m_1,m_2) a'$           $qL(m_1,m_2) \to q_M c' R(m_1,m_2)$

$qR(m_1,m_2)c \to q$                     $qE(m_1,m_2)c \to q$

$q_{TA} a \to q_a$                       $q_{TM} a \to q_a$

$q_b a \to q_a$                          $q\bot \to q_F \bot$

Recall that $\bot$ is the initial stack symbol of a pushdown automaton. As before
we use $G(P_M)$ to denote the graph of configurations of $P_M$.

To simplify matters we will use states also as names of propositions and take
valuation $\rho_M$ such that in a state $q'$ exactly proposition $q'$ holds, i.e., $\rho_M(q') =
\{q'\}$.

First we take two EF formulas Accept and Move such that:

- $M(P_M, \rho_M), q_{TA} w \models \mathrm{Accept}$ iff $w$ starts with an accepting configuration of
$M$.
- $M(P_M, \rho_M), q_{TM} w \models \mathrm{Move}$ iff $w$ is of the form $c'?(m_1,m_2)cw'$, $(m_1, m_2)$ is
the move of $M$, and $c \vdash_m c'$ where $m = m_1$ if $? = L$; $m = m_2$ if $? = R$; and
$m \in \{m_1, m_2\}$ if $? = E$.

From states $q_{TA}$ and $q_{TM}$ the behaviour of $P_M$ is deterministic. It only takes
letters from the stack one by one. The formula Accept is $\bigvee_{i=1,\dots,n+1} \exists\bigcirc^i q_a$,
where $q_a$ signals an accepting state of $M$. The formula Move is slightly more
complicated as it needs to code the behaviour of $M$. Still its construction is
standard.

The formula we are interested in is: 



$\alpha = \exists[(q \vee q_A \vee q_M) \wedge (q_A \Rightarrow \exists\bigcirc(q_{TA} \wedge \mathrm{Accept})) \wedge
(q_M \Rightarrow \exists\bigcirc(q_{TM} \wedge \mathrm{Move}))]\ U\ q_F$

It says that there is a path going only through states $q$, $q_A$ or $q_M$ and ending
in a state $q_F$. Moreover, whenever there is a state $q_A$ on the path then there
is a turn to a configuration with a state $q_{TA}$ from which the Accept formula holds.
Similarly for $q_M$.



Lemma 9. For every word $w$ over the stack alphabet: $q$ is reachable from $qw$ in
$G(EP_M)$ iff $M(P_M, \rho_M), qw\bot \models \alpha$.



Proof 

The proof in both directions is by induction on the length of the path. We will 
only present a part of the proof for the direction from left to right. 

If in $G(EP_M)$ the path is $qw \to q_A w \to q$ then in $G(P_M)$ we have:

$qw\bot \to q_A w\bot \to q\bot$, together with the side edge $q_A w\bot \to q_{TA} w\bot$.

The edge $q_A w \to q$ exists in $G(EP_M)$ only if $w$ is an accepting configuration.
Hence, we have that $M(P_M, \rho_M), q_{TA} w\bot \models \mathrm{Accept}$ and consequently we have the
thesis of the lemma.

If the path is $qw \to q_M c'?(m_1,m_2)w \to qc'?(m_1,m_2)w \to \cdots$ then in $G(P_M)$
we have:

$qw\bot \to q_M c'?(m_1,m_2)w\bot \to qc'?(m_1,m_2)w\bot \to \cdots$, together with the side edge
$q_M c'?(m_1,m_2)w\bot \to q_{TM} c'?(m_1,m_2)w\bot$.

The edge $q_M c'?(m_1,m_2)w \to qc'?(m_1,m_2)w$ exists in $G(EP_M)$ only when the
stack content $c'?(m_1,m_2)w$ satisfies the conditions of the $\to_M$ transition. This means
that $M(P_M, \rho_M), q_{TM} c'?(m_1,m_2)w\bot \models \mathrm{Move}$. From the induction assumption
we have $M(P_M, \rho_M), qc'?(m_1,m_2)w\bot \models \alpha$. Hence $M(P_M, \rho_M), qw\bot \models \alpha$. □



Theorem 1. The model checking problem for pushdown systems and CTL is
EXPTIME-complete.
Proof

The problem can be solved in EXPTIME as there is a linear translation of CTL
to the $\mu$-calculus and the model checking for the latter logic can be done in
EXPTIME [9].

To show the hardness part let $M$ be an alternating Turing machine as considered
in this section. For an input word $v$ of length $n$ we construct in polynomial time
a pushdown system $P_M^v$, valuation $\rho_M$ and a formula $\alpha_M$ such that: $v$ is accepted
by $M$ iff $M(P_M^v, \rho_M), q_0\bot \models \exists\bigcirc^{n+1}\alpha$. Let $c_0$ be the initial configuration of $M$ on
$v$. It has length $n + 1$.

Valuation $\rho_M$ and formula $\alpha_M$ are $\rho$ and $\alpha$ as described before Lemma 9.
The system $P_M^v$ is such that started in $q_0\bot$ it first puts the initial configuration
$c_0$ on the stack and then behaves as the system $P_M$.

By Lemma 8 we have that $M$ has an accepting computation from $c_0$ iff there
is a path from $qc_0$ to $q$ in $G(EP_M)$. By Lemma 9 this is equivalent to the fact
that $M(P_M, \rho_M), qc_0\bot \models \alpha_M$. By the construction of $P_M^v$ this is the same as saying
that $M(P_M^v, \rho_M), q_0\bot \models \exists\bigcirc^{n+1}\alpha_M$. □
A Decidable Dense Branching-Time Temporal Logic*

Salvatore La Torre^{1,2} and Margherita Napoli^2

^1 University of Pennsylvania
^2 Università degli Studi di Salerno



Abstract. Timed computation tree logic (Tctl) extends Ctl by allowing
timing constraints on the temporal operators. The semantics of
Tctl is defined on a dense tree. The satisfiability of Tctl-formulae is
undecidable even if the structures are restricted to dense trees obtained
from timed graphs. According to the known results there are two possible
causes of such undecidability: the denseness of the underlying structure
and the equality in the timing constraints. We prove that the second one
is the only source of undecidability when the structures are defined by
timed graphs. In fact, if the equality is not allowed in the timing
constraints of Tctl-formulae then the finite satisfiability in Tctl is
decidable. We show this result by reducing this problem to the emptiness
problem of timed tree automata, so strengthening the already well-founded
connections between finite automata and temporal logics.



1 Introduction 

In 1977 Pnueli proposed Temporal Logic as a formalism to specify and verify
computer programs [Pnu77]. This formalism turned out to be greatly useful for
reactive systems [HP85], that is, systems maintaining some interaction with their
environment, such as operating systems and network communication protocols.
Several temporal logics have been introduced and studied in the literature, and now
this formalism is widely accepted as a specification language for reactive systems
(see [Eme90] for a survey).

Temporal logic formulae allow us to express temporal requirements on the
occurrence of events. Typical temporal operators are “until”, “next”, “sometimes”,
“always”, and a typical assertion is “p is true until q is true”. These operators
allow us only to express qualitative requirements, that is, constraints on the
temporal ordering of the events, but we cannot place bounds on the time a certain
property must be true. As a consequence traditional temporal logics have been
augmented by adding timing constraints to temporal operators, so that
assertions such as “p is true until q is true within time 5” can be expressed. These
logics, which are often referred to as real-time or quantitative temporal logics,
are suitable when it is necessary to explicitly refer to time delays between events
and we want to check that some hard real-time constraints are satisfied.

* Work partially supported by M.U.R.S.T. grant TOSCA. 



S. Kapoor and S. Prasad (Eds.): FST TCS 2000, LNCS 1974, pp. 139-150, 2000.
© Springer-Verlag Berlin Heidelberg 2000







Besides the usual classification in linear and branching-time logics, real-time 
logics are classified according to the nature of the time model they use. Temporal 
logics based on discrete time models are presented in [EMSS90, JM86, Koy90, 
PH88]. An alternative approach is to model time as a dense domain. Temporal 
logics with this time model are Mitl [AFH96], Tctl [ACD93], Stctl [LN97], 
and Gctl [PH88]. For more about real-time logics, see [AH93, Hen98]. 

In this paper we are interested in branching-time temporal logics which use
a dense time domain and in particular we will consider the satisfiability problem
in Tctl that was introduced by Alur et al. in [ACD93]. Given a formula $\varphi$ we
want to determine if there exists a structure $M$ satisfying it. The syntax of Tctl
is given by augmenting the temporal operators of Ctl [CF81] (except for the
“next” which is discarded since it does not have any meaning in a dense time
domain) with a timing constraint of type $\sim c$, where $\sim$ is one among $<$, $\le$, $>$, $\ge$,
and $=$, and $c$ is a rational number. The semantics of Tctl is given on a dense
(or continuous) tree. It turns out that the satisfiability problem in Tctl is
undecidable even if the semantics is restricted to dense trees obtained from timed
graphs (finite satisfiability [ACD93]), that is, timed transition systems where the
transitions depend also on the current value of a finite number of clock variables.

Another real-time branching-time temporal logic is Stctl [LN97] which is 
obtained by restricting both the semantics and the syntax of Tctl. Instead 
of a dense tree, a timed $\omega$-tree is used to define the semantics of formulae,
and the equality is not allowed in the timing constraints. With these
restrictions the STCTL-satisfiability problem turns out to be decidable. This result
is obtained by reducing the STCTL-satisfiability problem to the emptiness
problem of finite automata on timed $\omega$-trees, which is shown to be decidable in
[LN97]. Introducing the equality in the timing constraints causes the loss of the
decidability. A similar kind of result was observed in Mitl, where the decid- 
ability is lost when the restriction to non-singular intervals is relaxed [AFH96]. 
In this paper we prove that this indeed holds also for the finite satisfiability in 
Tctl. In particular, we reduce the finite satisfiability problem of TCTL-formulae 
without equality in the timing constraints, to the emptiness problem of timed 
tree automata, via translation to the satisfiability problem of TCTL-formulae 
with respect to a proper subclass of STCTL-structures. Restricting the class of 
STCTL-structures is necessary since along any path of a timed graph the truth
assignments of the atomic propositions vary according to a sequence of left-closed
right-open intervals, while in general in STCTL-structures the truth assignments
change according to sequences of time intervals which are alternately singular
and open. Having defined the language of a logic as the set of formulae which
are satisfiable, as a consequence of the previous result we have that Tctl
interpreted on timed graphs is language equivalent to a proper restriction of Stctl.
Moreover, in this paper we also introduce a concept of a highly-deterministic
timed tree automaton with the aim of matching the concept of regular tree in
$\omega$-tree languages. The use of the theory of timed tree automata to achieve the
decidability of the Tctl finite satisfiability strengthens the relationship between
finite automata and temporal logics, also in the case of real-time logics. In a
recent paper [DW99] an automata-theoretic approach to TCTL-model checking
has been presented. There the authors introduced timed alternating tree automata
and rephrased the model-checking problem as a particular word problem
for these automata. For timed alternating tree automata, this decision problem
is decidable while the emptiness problem is not decidable.

The rest of the paper is organized as follows. In section 2 we recall the main 
definitions and results from the theory of timed tree automata, and we introduce 
a concept of highly-deterministic timed tree automaton. In section 3 we recall 
the temporal logics Tctl and Stctl with the related decidability results. The 
main result of this paper is presented in section 4, where the finite satisfiability 
in Tctl is shown to be decidable via reduction to the emptiness problem of 
timed tree automata. Finally, we give our conclusions in section 5. Due to lack
of space some proofs are omitted; for a full version of the paper see [URL].

2 Timed Tree Automata 

In this section we recall some definitions and results concerning timed
automata [AD94, LN97], and introduce the concept of highly-deterministic timed
tree automaton.

Let $\Sigma$ be an alphabet and $dom(t)$ be a subset of $\{1, \dots, k\}^*$, for an integer $k >
0$, such that (i) $\varepsilon \in dom(t)$, and (ii) if $v \in dom(t)$, then for some $j \in \{1, \dots, k\}$,
$vi \in dom(t)$ for any $i$ such that $1 \le i \le j$ and $vi \notin dom(t)$ for any $i > j$. A
$\Sigma$-valued $\omega$-tree is a mapping $t : dom(t) \to \Sigma$. For $v \in dom(t)$, we denote with
$pre(v)$ the set of prefixes and with $deg(v)$ the arity of $v$. A path in $t$ is a maximal
subset of $dom(t)$ linearly ordered by the prefix relation. Often we will denote a
path $\pi$ with the ordered sequence of its nodes $v_0, v_1, v_2, \dots$ where $v_0$ is $\varepsilon$. A timed
$\Sigma$-valued $\omega$-tree is a pair $(t, \tau)$ where $t$ is a $\Sigma$-valued $\omega$-tree and $\tau$, called time
tree, is a mapping from $dom(t)$ into the set of the nonnegative real numbers $\mathbb{R}_+$
such that (i) $\tau(v) > 0$, for each $v \in dom(t) - \{\varepsilon\}$ (positiveness), and (ii) for each
path $\pi$ and for each $x \in \mathbb{R}_+$ there exists $v \in \pi$ such that $\sum_{u \in pre(v)} \tau(u) \ge x$
(progress property). Nodes of a timed $\omega$-tree become available as the time elapses,
that is, at a given time only a finite portion of the tree is available. Each node
of a timed $\omega$-tree is labelled by a pair (symbol, real number): for the root the
real number is the absolute time of occurrence, while for the other nodes it is the
time which has elapsed since their parent node was read. Positiveness implies
that a positive delay occurs between any two consecutive nodes along a path.
The progress property guarantees that infinitely many events (i.e. nodes appearing
at input) cannot occur in a finite slice of time (nonzenoness). We denote with
$\gamma_v$ the absolute time at which a node $v$ is available, that is $\gamma_v = \sum_{u \in pre(v)} \tau(u)$. In
the rest of the paper, we will consistently use $\gamma$ to denote absolute time, i.e. time
elapsed from the beginning of a computation, and $\tau$ to denote delays between
events. Moreover, we will use the term tree to refer to a $\Sigma$-valued $\omega$-tree for some
alphabet $\Sigma$ and the term timed tree to refer to a timed $\Sigma$-valued $\omega$-tree.
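
The relation between delays and absolute times can be made concrete on a finite fragment of a timed tree. The Python sketch below is a hedged illustration: nodes are encoded as tuples of child indices (the root is the empty tuple), the time tree is a dictionary from nodes to delays, and $pre(v)$ is assumed to include $v$ itself as well as the root. The names `prefixes`, `absolute_time` and `is_positive` are hypothetical.

```python
def prefixes(v):
    """All prefixes of node v, from the root () up to v itself."""
    return [v[:i] for i in range(len(v) + 1)]


def absolute_time(tau, v):
    """Absolute time of node v: the sum of the delays along its prefixes."""
    return sum(tau[u] for u in prefixes(v))


def is_positive(tau):
    """Positiveness: every node other than the root has a positive delay."""
    return all(delay > 0 for node, delay in tau.items() if node != ())


# Finite fragment of a time tree: root at time 0, two sons, one grandson.
tau = {(): 0.0, (1,): 1.5, (2,): 2.0, (1, 1): 0.5}
gamma = absolute_time(tau, (1, 1))   # 0.0 + 1.5 + 0.5
```

The progress property quantifies over infinite paths and therefore cannot be checked on a finite fragment; only positiveness is tested here.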

Now we recall the definition of timed Büchi tree automaton. It is possible to
extend this paradigm by considering other acceptance conditions such as Muller,
Rabin, or Streett [Tho90]. Timed Muller tree automata as well as timed Büchi
tree automata were introduced and studied in [LN97]. To define timed automata
we introduce the notion of clock, timing constraint, and clock valuation. A finite
set of clock variables (or simply clocks) is used to test timing constraints. Each
clock can be seen as a chronograph which is synchronized to a unique system
clock. Clocks can be read or set to zero (reset): after a reset, a clock automatically
restarts. Timing constraints are expressed by clock constraints. Let $C$ be a set of
clocks; the set of clock constraints $\Xi(C)$ contains boolean combinations of simple
clock constraints of type $x < y + c$, $x > y + c$, $x < c$, and $x > c$, where $x, y \in C$
and $c$ is a rational number. A clock valuation is a mapping $\nu : C \to \mathbb{R}_+$. If $\nu$
is a clock valuation, $\lambda$ is a set of clocks and $d$ is a real number, we denote with
$[\lambda \to 0](\nu + d)$ the clock valuation that gives 0 for each clock $x \in \lambda$ and $\nu(x) + d$
for each clock $x \notin \lambda$.
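
The operation on clock valuations just defined can be written in a few lines. This Python sketch is illustrative only; it represents a valuation as a dictionary from clock names to numbers, and the name `shift_and_reset` is hypothetical.

```python
def shift_and_reset(nu, d, resets):
    """Let d time units elapse, then reset the clocks in `resets` to 0.

    nu     : dict mapping each clock name to its current value
    d      : elapsed time
    resets : set of clock names to be reset
    """
    return {x: (0.0 if x in resets else value + d) for x, value in nu.items()}


nu = {"x": 1.0, "y": 2.5}
after = shift_and_reset(nu, 0.5, {"x"})   # x is reset, y advances by 0.5
```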

A Büchi timed tree automaton is a 6-tuple $A = (\Sigma, S, S_0, C, \Delta, F)$, where:

• $\Sigma$ is an alphabet;

• $S$ is a finite set of locations;

• $S_0 \subseteq S$ is the set of starting locations;

• $C$ is a finite set of clocks;

• $\Delta$ is a finite subset of $\bigcup_{k \ge 0}(S \times \Sigma \times S^k \times (2^C)^k \times \Xi(C))$;

• $F \subseteq S$ is the set of accepting locations.

A timed Büchi tree automaton $A$ is deterministic if $|S_0| = 1$ and for each
pair of different tuples $(s, \sigma, s_1, \dots, s_k, \lambda_1, \dots, \lambda_k, \delta)$ and $(s, \sigma, s_1', \dots, s_k', \lambda_1', \dots,
\lambda_k', \delta')$ in $\Delta$, $\delta$ and $\delta'$ are inconsistent (i.e., $\delta \wedge \delta' = \mathit{false}$ for all clock valuations).

A state of the system is completely determined by a location and a clock
valuation, thus it is denoted by a pair $(s, \nu)$. A transition rule $(s, \sigma, s_1, \dots, s_k, \lambda_1, \dots,
\lambda_k, \delta) \in \Delta$ can be described as follows. Suppose that the system is in the state
$(s, \nu)$, and after a time $\tau$ the symbol $\sigma$ is read. The system can take the transition
$(s, \sigma, s_1, \dots, s_k, \lambda_1, \dots, \lambda_k, \delta)$ if the current clock valuation (i.e. $\nu + \tau$) satisfies the
clock constraint $\delta$. As a consequence of the transition, the system will enter the
states $(s_1, \nu_1), \dots, (s_k, \nu_k)$ where $\nu_1 = [\lambda_1 \to 0](\nu + \tau), \dots, \nu_k = [\lambda_k \to 0](\nu + \tau)$.
Each node of a timed tree has thus a location and a clock valuation assigned,
according to the transition rules in $\Delta$. Formally, this is captured by the concept
of run. A run of $A$ on a timed tree $(t, \tau)$ is a pair $(r, \nu)$, where:

• $r : dom(t) \to S$ and $\nu : dom(t) \to \mathbb{R}_+^C$;

• $r(\varepsilon) \in S_0$ and $\nu(\varepsilon) = \nu_0$, where $\nu_0(x) = 0$ for any $x \in C$;

• for $v \in dom(t)$, $k = deg(v)$: $(r(v), t(v), r(v1), \dots, r(vk), \lambda_1, \dots, \lambda_k, \delta) \in \Delta$,
$\nu(v) + \tau(v)$ fulfils $\delta$ and $\nu(vi) = [\lambda_i \to 0](\nu(v) + \tau(v))$ $\forall i \in \{1, \dots, k\}$.
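
The effect of a single transition rule, as described above, can be sketched as follows. This is a minimal Python illustration under stated assumptions: a rule is encoded as a tuple `(location, symbol, successors, resets, guard)`, the clock constraint $\delta$ is represented by a Python predicate on valuations, and the name `fire` is hypothetical.

```python
def fire(rule, nu, delay):
    """Try to take `rule` from valuation `nu` after `delay` time units.

    Returns the list of successor states (s_i, nu_i), or None when the
    advanced valuation nu + delay does not satisfy the guard.
    """
    location, symbol, successors, resets, guard = rule
    advanced = {x: value + delay for x, value in nu.items()}
    if not guard(advanced):
        return None
    # the i-th son gets location s_i and the valuation [lambda_i -> 0](nu + delay)
    return [
        (s_i, {x: (0.0 if x in lam_i else value) for x, value in advanced.items()})
        for s_i, lam_i in zip(successors, resets)
    ]


# Rule with two sons: the first son resets clock x, the second resets nothing.
rule = ("s", "a", ("s1", "s2"), ({"x"}, set()), lambda v: v["x"] <= 2)
start = {"x": 0.0, "y": 0.0}
```

Here `fire(rule, start, 1.0)` yields two successor states, while `fire(rule, start, 3.0)` returns `None` because the guard `x <= 2` fails after three time units.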

Clearly, deterministic timed automata have at most one run for each timed
tree. A timed tree $(t, \tau)$ is accepted by $A$ if and only if there is a run $(r, \nu)$ of
$A$ on $(t, \tau)$ and a path $\pi$ such that $r(u) \in F$ for infinitely many $u$ on $\pi$. The
language accepted by $A$, denoted by $T(A)$, is the set of all timed trees accepted
by $A$. In the following we refer to (timed) Büchi tree automata simply as (timed)
tree automata.
For a timed tree automaton the set of states is infinite. However, they can be
finitely partitioned according to a finite-index equivalence relation over the clock
valuations. Each equivalence class, called clock region, is defined in such a way
that all the clock valuations in an equivalence class satisfy the same set of clock
constraints from a given timed automaton (see [AD94] for a precise definition).
Given a clock valuation $\nu$, $[\nu]$ denotes the clock region containing $\nu$. A clock
region $\alpha'$ is said to be a time-successor of a clock region $\alpha$ if and only if for any
$\nu \in \alpha$ there is a $d \in \mathbb{R}_+$ such that $\nu + d \in \alpha'$. The region automaton of a timed
tree automaton $A$ is a transition system defined by:

• the set of states $R(S) = \{(s, \alpha) \mid s \in S$ and $\alpha$ is a clock region for $A\}$;

• the set of starting states $R(S_0) = \{(s_0, \alpha_0) \mid s_0 \in S_0$ and $\alpha_0$ satisfies $x = 0$
for all $x \in C\}$;

• the transition rules $R(\Delta)$ such that: $((s, \alpha), \sigma, (s_1, \alpha_1), \dots, (s_k, \alpha_k)) \in R(\Delta)$
if and only if $(s, \sigma, s_1, \dots, s_k, \lambda_1, \dots, \lambda_k, \delta) \in \Delta$ and there is a time-successor
$\alpha'$ of $\alpha$ such that $\alpha'$ satisfies $\delta$ and $\alpha_i = [\lambda_i \to 0]\alpha'$ for all $i \in \{1, \dots, k\}$.

The region automaton is the key to reducing the emptiness problem of timed
tree automata to the emptiness problem of tree automata. Given a timed tree
language $T$, $\mathrm{Untime}(T)$ is the tree language $\{t \mid (t, \tau) \in T\}$. We will denote
by $R(A)$ the timed tree automaton accepting $\mathrm{Untime}(T(A))$ and obtained by
the region automaton (see [LN97] for more details).

Theorem 1. [LN97] For timed Büchi tree automata:

• Emptiness problem is decidable in time exponential in the length of timing
constraints and polynomial in the number of locations.

• Closure under union and intersection holds.

We end this section by introducing for timed tree automata a concept which
captures some of the properties that regular trees have in the context of tree
languages. We will use this notion to relate timed tree automata to timed
graphs. A timed tree automaton $A = (\Sigma, S, S_0, C, \Delta, F)$ is said to be highly
deterministic if $\mathrm{Untime}(T(A))$ contains a unique tree, and for $s \in S$, $e =
(s, \sigma, s_1, \dots, s_k, \lambda_1, \dots, \lambda_k, \delta) \in \Delta$ and $e' = (s, \sigma', s_1', \dots, s_k', \lambda_1', \dots, \lambda_k', \delta') \in \Delta$
imply that $e = e'$. The second property of highly-deterministic timed tree automata
simply states that there is at most one transition rule that can be executed
in each location $s \in S$. A timed tree automaton $A' = (\Sigma, S', S_0', C, \Delta', F')$ is
contained in $A = (\Sigma, S, S_0, C, \Delta, F)$ if $S' \subseteq S$, $S_0' \subseteq S_0$, $\Delta' \subseteq \Delta$, and $F' \subseteq F$.
Clearly, $T(A') \subseteq T(A)$ holds. We recall that a regular tree contains a finite number
of subtrees. Given a timed tree automaton $A = (\Sigma, S, S_0, C, \Delta, F)$, and a
regular run $r$ of $R(A)$ on a regular tree $t \in T(R(A))$, we define a shrink of $r$
and $t$ as the labelled directed finite graph $G = (V, E, lab)$ such that there is a
mapping $\theta : dom(t) \to V$ such that:

• for any $u, u' \in dom(t)$, $\theta(u) = \theta(u')$ implies that $deg(u) = deg(u')$, and for
each $i = 1, \dots, deg(u)$, $\theta(ui) = \theta(u'i)$;

• $E = \{(\theta(u), \theta(ui), i) \mid u \in dom(t)$ and $i \le deg(u)\}$, and $(v, v', i) \in E$ is an
edge from $v$ to $v'$ labelled by $i$;

• for $v \in V$, $lab(v) = (r(u), t(u))$ for any $u$ such that $v = \theta(u)$.
From the definition of regular tree, such a graph $G$ always exists. Thus, the
following theorem holds.

Theorem 2. Given a timed tree automaton $A$, $T(A)$ is not empty if and only
if there exists a highly-deterministic timed tree automaton contained in $A$.

Later in the paper we will use the following property. Given a highly-deterministic
timed tree automaton $A$, there exists a highly-deterministic timed tree
automaton $A'$ such that $T(A) = T(A')$ and for each transition rule $(s, \sigma, s_1, \dots, s_k,
\lambda_1, \dots, \lambda_k, \delta)$ of $A'$ we have that $s_i \ne s_j$ for $i \ne j$. We call such an automaton
a graph-representable timed tree automaton, since it corresponds to a labelled
directed graph such that for any ordered pair of locations $(s, s')$ there is at most
one edge connecting $s$ to $s'$ in the graph.



3 Timed Computation Tree Logic 

In this section we recall the real-time branching-time temporal logics Tctl 
[ACD93] and Stctl [LN97]. 

Let AP be a set of atomic propositions; the syntax of Tctl-formulae is given
by the following grammar:

$\varphi := p \mid \neg\varphi \mid \varphi \wedge \varphi \mid \exists[\varphi\, U_{\sim c}\, \varphi] \mid \forall[\varphi\, U_{\sim c}\, \varphi]$

where $p \in AP$, $\sim \in \{<, \le, >, \ge\}$, and $c$ is a rational number. Notice that
the TCTL-syntax given in [ACD93] allows the use of equality in the timing
constraints. Here we restrict the syntax to obtain our decidability result.

Before giving the semantics of Tctl, we introduce some common notation.
The constant False is equivalent to $\varphi \wedge \neg\varphi$, the constant True is equivalent to $\neg$False,
and $\exists\Diamond_{\sim c}\varphi$ and $\forall\Diamond_{\sim c}\varphi$ are equivalent to $\exists[\mathrm{True}\ U_{\sim c}\ \varphi]$ and $\forall[\mathrm{True}\ U_{\sim c}\ \varphi]$, respectively.

In the rest of the paper with AP we denote the set of atomic propositions of the
considered TCTL-formulae. If it is not differently stated, with $\sim$ we refer to a
relational operator in $\{<, \le, >, \ge\}$, and with $c$ to a rational number. We define
a dense path through a set of nodes $S$ as a function $\rho : \mathbb{R}_+ \to S$. With $\rho_I$ we
denote the restriction of $\rho$ to an interval $I$ and with $\rho_{[0,b)} \cdot \rho'$ the dense path
defined as $(\rho_{[0,b)} \cdot \rho')(d) = \rho(d)$, if $d < b$, and $(\rho_{[0,b)} \cdot \rho')(d) = \rho'(d - b)$, otherwise.
The semantics of Tctl is given with respect to a dense tree. A $\Sigma$-valued dense
tree $M$ is a triple $(S, \mu, f)$ where:

• $S$ is a set of nodes;

• $\mu : S \to \Sigma$ is a labelling function;

• $f$ is a function assigning to each $s \in S$ a set of dense paths through $S$,
starting at $s$, and satisfying the tree constraint: $\forall\rho \in f(s)$ and $\forall t \in \mathbb{R}_+$,
$\rho_{[0,t)} \cdot f(\rho(t)) \subseteq f(s)$.

Given a $2^{AP}$-valued dense tree $M = (S, \mu, f)$, a state $s$, and a formula $\varphi$, $\varphi$
is satisfied at $s$ in $M$ if and only if $M, s \models \varphi$, where the relation $\models$ is defined as
follows:

• for $p \in AP$, $M, s \models p$ if and only if $p \in \mu(s)$;

• $M, s \models \neg\psi$ if and only if not($M, s \models \psi$);

• $M, s \models \psi_1 \wedge \psi_2$ if and only if $M, s \models \psi_1$ and $M, s \models \psi_2$;

• $M, s \models \exists[\psi_1 U_{\sim c} \psi_2]$ if and only if $\exists \rho \in f(s)$ and $\exists d \sim c$ such that $M, \rho(d) \models \psi_2$
and, for each $d'$ such that $0 \leq d' < d$, $M, \rho(d') \models \psi_1$;

• $M, s \models \forall[\psi_1 U_{\sim c} \psi_2]$ if and only if $\forall \rho \in f(s)$, $\exists d \sim c$ such that $M, \rho(d) \models \psi_2$
and $M, \rho(d') \models \psi_1$ for each $d'$ such that $0 \leq d' < d$.

We say that $M$ is a TCTL-model of $\varphi$ if and only if $M, s \models \varphi$ for some $s \in S$.
Moreover, a TCTL-formula $\varphi$ is said to be satisfiable if and only if there exists a
TCTL-model of $\varphi$.

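We define the closure of a TCTL-formula $\varphi$, denoted by $cl(\varphi)$, as the set of
all the subformulae of $\varphi$, and the extended closure, denoted by $ecl(\varphi)$, as the set
$cl(\varphi) \cup \{\neg\psi \mid \psi \in cl(\varphi)\}$. Moreover, we define $S_\varphi \subseteq 2^{ecl(\varphi)}$ as the collection of
sets $\Psi$ with the following properties:

• $\psi \in \Psi \Rightarrow \neg\psi \notin \Psi$;

• $\psi_1 \wedge \psi_2 \in \Psi \Leftrightarrow \psi_1 \in \Psi$ and $\psi_2 \in \Psi$;

• $\alpha[\psi_1 U_{\sim c} \psi_2] \in \Psi$, $\alpha \in \{\forall, \exists\}$ $\Rightarrow$ $\psi_1 \in \Psi$ or ($\psi_2 \in \Psi$ and $(0 \sim c)$);

• $\Psi$ is maximal, that is, for each $\psi \in ecl(\varphi)$: either $\psi \in \Psi$ or $\neg\psi \in \Psi$.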

Note that $S_\varphi$ contains the maximal sets of formulae in $ecl(\varphi)$ which are consistent,
in the sense that, given a $\Psi \in S_\varphi$ and a dense tree $(S, \mu, f)$, the fulfilment
at a given $s \in S$ of a formula in $\Psi$ does not prevent all the other formulae in
$\Psi$ from being satisfied at $s$. From now on we only consider TCTL-formulae, thus
we will refer to them simply as formulae. In the rest of this section we recall two
semantic restrictions to TCTL that have been considered in the literature.
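
As an illustration of the closure and extended closure defined above, the following is a minimal Python sketch. The class names (`Prop`, `Not`, `And`, `Until`) and the functions `closure` and `extended_closure` are hypothetical names chosen for this example and do not come from the paper. The sketch treats formulae purely syntactically, so $\neg\neg\psi$ is not identified with $\psi$.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Union

# Hypothetical AST for the restricted TCTL grammar (no equality in constraints).
@dataclass(frozen=True)
class Prop:
    name: str

@dataclass(frozen=True)
class Not:
    arg: "Formula"

@dataclass(frozen=True)
class And:
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class Until:
    quant: str        # "E" for the existential, "A" for the universal quantifier
    left: "Formula"
    right: "Formula"
    op: str           # one of "<", "<=", ">", ">="
    bound: Fraction   # the rational constant c

Formula = Union[Prop, Not, And, Until]

def closure(phi):
    """Return cl(phi): the set of all subformulae of phi, phi included."""
    subs = {phi}
    if isinstance(phi, Not):
        subs |= closure(phi.arg)
    elif isinstance(phi, (And, Until)):
        subs |= closure(phi.left) | closure(phi.right)
    return frozenset(subs)

def extended_closure(phi):
    """Return ecl(phi) = cl(phi) united with the negations of its members."""
    cl = closure(phi)
    return cl | frozenset(Not(psi) for psi in cl)
```

For example, for $\exists[p \, U_{<3} \, \neg q]$ the closure has four elements and the extended closure seven, because $\neg q$ already belongs to the closure.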

3.1 Finite Satisfiability 

A first restriction of TCTL-semantics consists of considering only dense trees
defined by runs of a timed graph [ACD93]. A timed graph is a tuple $G =
(V, \mu, s_0, E, C, \lambda, \xi)$, where:

• $V$ is a finite set of vertices;

• $\mu : V \to 2^{AP}$ is a labelling function;

• $s_0$ is the start vertex;

• $E \subseteq V \times V$ is the set of edges;

• $C$ is a finite set of clocks;

• $\lambda : E \to 2^{C}$ maps each edge to a set of clocks to reset;

• $\xi : E \to \Xi(C)$ maps each edge to a clock constraint.

A timed graph is a timed transition system, where vertices correspond to locations
and edges to transitions. A state is given by the current location and the
array of all clock values. When a clock constraint is satisfied by the clock valuation
of the current state, the corresponding transition can be taken. A transition
$e$ forces the system to move, instantaneously, to a new state which is described
by the target location of $e$, and the clock values obtained by resetting the clocks
in the reset set of $e$. Any computation of the system maps reals to states. This
concept is captured by the notion of run. Given a state $(s, \nu)$ of a timed graph
$G$, an $(s, \nu)$-run of $G$ is an infinite sequence of triples $(s_1, \nu_1, \tau_1), (s_2, \nu_2, \tau_2), \ldots$
where:

• $s_1 = s$, $\nu_1 = \nu$, and $\tau_1 = 0$;

• for $i \geq 1$, $s_i \in V$, $\tau_i \in \Re^+$, and $\nu_i$ is a clock valuation;

• $e_i = (s_i, s_{i+1}) \in E$, $\nu_{i+1} = [\lambda(e_i) \to 0](\nu_i + \tau_{i+1})$, $(\nu_i + \tau_{i+1})$ satisfies
the enabling condition $\xi(e_i)$, and the series of reals $\tau_i$ is divergent (progress
condition).

An $(s, \nu)$-run can also be seen as a real-valued mapping $\rho(d)$ defined as $\rho(d) =
(s_i, \nu_i + d - \gamma_i)$ for $d \in \Re^+$ such that $\gamma_i \leq d < \gamma_{i+1}$ ($\rho$ is also said to be a dense
path of $G$). Notice that a dense path $\rho$ gives for each time a truth assignment of
the atomic propositions. Moreover, the truth values stay unchanged in intervals
of type $[\gamma_i, \gamma_{i+1})$. The dense tree $M$ defined by a timed graph $G$ is a tuple
$(V \times \Re^n, \mu', f)$ where $\mu'(s, \nu) = \mu(s)$ and $f(s, \nu)$ is the set of all the paths
corresponding to $(s, \nu)$-runs of $G$. For a formula $\varphi$, we say that $G \models \varphi$ if and
only if $M, (s_0, \nu_0) \models \varphi$, where $\nu_0(x) = 0$ for any clock $x \in C$. Thus a formula $\varphi$ is
finitely satisfiable if and only if there exists a timed graph $G$ such that $G \models \varphi$.
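
To make one step of a run concrete, the following minimal Python sketch lets time elapse, checks the enabling condition of an edge on the aged valuation, and then resets the clocks in the reset set. The names `Edge`, `delay`, and `take` are hypothetical, and representing a clock constraint as a Python predicate is an assumption made only for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple

Valuation = Dict[str, float]          # clock name -> current value
State = Tuple[str, Valuation]         # (location, clock valuation)

@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    resets: FrozenSet[str]                 # clocks reset when the edge is taken
    guard: Callable[[Valuation], bool]     # enabling condition (assumed encoding)

def delay(valuation: Valuation, t: float) -> Valuation:
    """Let t time units elapse: every clock advances by the same amount."""
    return {clock: value + t for clock, value in valuation.items()}

def take(edge: Edge, state: State, t: float) -> State:
    """Wait t time units in the current location, then take the edge."""
    location, valuation = state
    if location != edge.source:
        raise ValueError("edge does not leave the current location")
    aged = delay(valuation, t)
    if not edge.guard(aged):
        raise ValueError("enabling condition not satisfied")
    # Reset the clocks in the reset set; all other clocks keep their aged value.
    after = {c: (0.0 if c in edge.resets else v) for c, v in aged.items()}
    return edge.target, after
```

A full run iterates `take` along an infinite sequence of edges whose delays have a divergent sum; the sketch only covers a single step.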



3.2 Restricting the Semantics to Timed Trees 

In this section we recall the temporal logic STCTL, which is obtained by restricting
the TCTL-semantics to dense trees obtained from $2^{AP} \times 2^{AP}$-valued $\omega$-trees. An
STCTL-structure is a timed $2^{AP} \times 2^{AP}$-valued $\omega$-tree $(t, \tau)$ with $\tau(\varepsilon) = 0$. Given
an STCTL-structure $(t, \tau)$ we denote by $t_{open}$ and $t_{sing}$ the functions defined as
$(t_{open}(v), t_{sing}(v)) = t(v)$ for each $v \in dom(t)$. An open and a singular interval
along the paths in $(t, \tau)$ correspond to each node $v \neq \varepsilon$: $t_{open}(v)$ and $t_{sing}(v)$
are the sets of the atomic propositions which are true in these two intervals.
For $v = \varepsilon$, only $t_{sing}(v)$ is meaningful. Given a path $\pi = v_0, v_1, v_2, \ldots$ in an
STCTL-structure $(t, \tau)$, a dense path in $(t, \tau)$ corresponding to $\pi$ and shifted by
$d$ is a function $\rho_d^\pi : \Re^+ \to 2^{AP}$ such that for any natural number $i$:

$$\rho_d^\pi(d') = \begin{cases} t_{sing}(v_i) & \text{if } d + d' = \gamma_{v_i}, \\ t_{open}(v_{i+1}) & \text{if } \gamma_{v_i} < d + d' < \gamma_{v_{i+1}}. \end{cases}$$

Thus any dense path in $(t, \tau)$ corresponds to a sequence of alternately open and
singular intervals where the truth values stay unchanged. Clearly, an STCTL-structure
has a dense time semantics on paths and a discrete branching-time
structure. In particular, an STCTL-structure $(t, \tau)$ defines the dense tree $M^{t,\tau} =
(S, \mu, f)$ where (1) $S = \{(v_i, d) \mid v_i \in dom(t)$ and $0 < d \leq \tau(v_i)\} \cup \{(\varepsilon, 0)\}$, (2)
$\mu(\varepsilon, 0) = t_{sing}(\varepsilon)$, $\mu(v_i, d) = t_{open}(v_i)$ if $d < \tau(v_i)$, and $\mu(v_i, d) = t_{sing}(v_i)$
otherwise, and (3) $f(\varepsilon, 0)$ is the set of all dense paths $\rho_0^\pi$ of $(t, \tau)$, and $f(v_i, d)$
is the set of all the dense paths of $(t, \tau)$ starting at $(v_i, d)$. For a formula $\varphi$, we say that
$(t, \tau) \models \varphi$, i.e. $(t, \tau)$ is an STCTL-model of $\varphi$, if and only if $M^{t,\tau}, (\varepsilon, 0) \models \varphi$.
Thus a formula $\varphi$ is STCTL-satisfiable if and only if there exists an STCTL-model
of $\varphi$.

In [LN97] the problem of STCTL-satisfiability is reduced to the emptiness
problem of timed tree automata. In particular, given a formula $\varphi$ it is possible
to construct a timed tree automaton accepting a nonempty language if and only
if $\varphi$ is STCTL-satisfiable. Moreover, all the accepted trees are STCTL-models of
$\varphi$. The corresponding construction leads to the following results.

Theorem 3. [LN97] Given a formula $\varphi$, if $\varphi$ is STCTL-satisfiable then there
exists an STCTL-model $(t, \tau)$ of $\varphi$ such that:

• for each $v \in dom(t)$, $deg(v) \leq 2 \max_{\Psi \in S_\varphi} |\{\exists\psi \mid \exists\psi \in \Psi\}| + 1$, and

• there exists a mapping $\eta : dom(t) \to S_\varphi \times S_\varphi$ such that $M^{\eta,\tau}, (v, d) \models \psi$
for each $\psi \in \mu(v, d)$, where $M^{\eta,\tau} = (S, \mu, f)$.

Moreover, there exists a timed $\omega$-tree automaton $A_\varphi$ with states and
timing constraints of total size $O(|\varphi|)$ such that $(t, \tau)$ is an STCTL-model of $\varphi$
satisfying the above properties if and only if $(t, \tau) \in T(A_\varphi)$.

By Theorems 1 and 3 above, the satisfiability problem in STCTL is decidable
in exponential time.

4 Decidability of Finite Satisfiability 

In this section we prove the main result of this paper. We show that the finite
satisfiability of formulae is decidable. This result is obtained by proving that a
formula is finitely satisfiable if and only if it is satisfiable over a particular class
of STCTL-structures, the left-closed right-open STCTL-structures. Then we show
that the satisfiability of formulae on these structures is decidable via a reduction
to the emptiness problem of timed tree automata. Finally, we prove that the
set of formulae which are STCTL-satisfiable strictly contains the set of
finitely-satisfiable formulae.

Let $L_{Fin}$ be the language of formulae that are finitely satisfiable. We start by
providing a characterization of $L_{Fin}$ based on a subclass of STCTL-structures. Let
$(t, \tau)$ be an STCTL-structure; $(t, \tau)$ is said to be a left-closed right-open
STCTL-structure if $t_{sing}(v) = t_{open}(vi)$ for any $v \in dom(t)$ and $1 \leq i \leq deg(v)$, where
$vi$ denotes the $i$-th child of $v$. Before showing that the set of formulae which are
finitely satisfiable is exactly the set of formulae which are satisfiable over
left-closed right-open STCTL-structures, we prove that the existence of a left-closed
right-open STCTL-model of a formula is decidable. The decision procedure we give is
obtained, as for STCTL-satisfiability, via a reduction to the emptiness problem of
timed tree automata.



Lemma 1. Given a formula $\varphi$, there exists a timed tree automaton $A$ such that
(1) $T(A)$ is not empty if and only if there is a left-closed right-open STCTL-model
of $\varphi$, and (2) for each $(t, \tau) \in T(A)$ there exists a function $\eta : dom(t) \to
S_\varphi \times S_\varphi$ such that $M^{\eta,\tau}, (v, d) \models \psi$ for each $\psi \in \mu(v, d)$, where $M^{\eta,\tau} = (S, \mu, f)$.
Moreover, the existence of a left-closed right-open STCTL-model of $\varphi$ can be
checked in exponential time.
The next two lemmata show that the finitely-satisfiable formulae are exactly
the formulae which are satisfiable over left-closed right-open STCTL-structures.

Lemma 2. Given a formula $\varphi$, if $\varphi$ is finitely satisfiable then $\varphi$ is satisfiable on
a left-closed right-open STCTL-structure.

Proof. Let $G$ be a timed graph such that $G \models \varphi$. For each subformula $\psi = \exists\psi'$
of $\varphi$ such that $G \models \psi$, we denote by $\rho_\psi$ a dense path in $G$ such that $\psi$ is satisfied
on $\rho_\psi$. Let $\Pi$ be the set of all these paths $\rho_\psi$. If $\Pi$ is empty, then we add to $\Pi$
an arbitrary dense path of $G$. Consider the dense tree obtained by deleting all the
paths from $G$ except the paths in $\Pi$. Since there are only a finite number of such
paths, this tree can be mapped into a left-closed right-open STCTL-structure
$(t, \tau)$ such that $(t, \tau) \models \varphi$.

Lemma 3. Given a formula $\varphi$, if $\varphi$ has a left-closed right-open STCTL-model
then $\varphi$ is finitely satisfiable.

Proof. From Lemma 1 we have that there exists a timed tree automaton $A_\varphi$
accepting left-closed right-open STCTL-models of $\varphi$, if there are any. We can
consider a new timed tree automaton $A'_\varphi$ accepting $2^{AP}$-valued $\omega$-trees obtained
from the timed trees $(t, \tau) \in T(A_\varphi)$ by disregarding $t_{open}(v)$ for each
$v \in dom(t)$ (we recall that for left-closed right-open STCTL-structures $t_{sing}(v) =
t_{open}(vi)$). Clearly, $T(A'_\varphi)$ is not empty, and hence by Theorem 2, there exists
a highly-deterministic timed tree automaton contained in $A'_\varphi$ and, as a
consequence, there exists a graph-representable timed tree automaton $A' =
(2^{AP}, S', s_0, \Delta', C)$ such that $T(A') \subseteq T(A'_\varphi)$. Let $G$ be a timed graph
$(S', \mu, s_0, E, C, \lambda, \xi)$ such that $\mu(s) = \sigma$, $\xi(e) = \delta$ for any $e = (s', s) \in E$,
$e_i = (s, s_i) \in E$ and $\lambda(e_i) = \lambda_i$ for $i = 1, \ldots, k$ if and only if
$(s, \sigma, s_1, \ldots, s_k, \lambda_1, \ldots, \lambda_k, \delta) \in \Delta'$.
Notice that, due to the properties of $A'$, $G$ is well defined. Denoting by $\nu_0$ the clock
valuation mapping each clock to 0, by the above construction we have that each
$(s_0, \nu_0)$-run $\rho$ of $G$ is a continuous path of a timed tree $(t, \tau) \in T(A')$ and,
conversely, for each $(t, \tau) \in T(A')$ any continuous path $\rho'$ in $(t, \tau)$ is also an
$(s_0, \nu_0)$-run of $G$. Moreover, by Lemma 1, since $T(A') \subseteq T(A'_\varphi)$, for $(t, \tau) \in T(A')$
there is a timed tree $(\eta, \tau)$ such that $\eta : dom(t) \to S_\varphi \times S_\varphi$ and $M^{\eta,\tau}, (v, d) \models \psi$
for each $\psi \in \mu(v, d)$, where $M^{\eta,\tau} = (S_\eta, \mu, f)$ is the dense tree corresponding
to $(\eta, \tau)$. Notice that $\eta$ is independent of the choice of $(t, \tau) \in T(A')$, since $A'$
is highly deterministic. Thus, since $A'$ is graph-representable, $\eta$ defines in an
obvious way a labelling function $\eta'$ of the vertices of $G$ such that $G \models \psi$ for each
$\psi \in \eta'(s_0)$. Since $(t, \tau) \models \varphi$, it holds that $M^{\eta,\tau}, (v, d) \models \varphi$ and thus $\varphi \in \eta'(s_0)$.
Hence $G \models \varphi$, and $\varphi$ is finitely satisfiable.

Directly from the last two lemmata we have the following theorem. 

Theorem 4. A TCTL-formula $\varphi$ is finitely satisfiable if and only if $\varphi$ has a
left-closed right-open STCTL-model.

As a consequence of the above results, the membership problem in $L_{Fin}$ is
decidable in exponential time and can be reduced to the emptiness problem of
timed tree automata.
Theorem 5. The finite satisfiability of TCTL-formulae is decidable in exponential
time.

Proof. By Theorem 4, we have that $\varphi$ is finitely satisfiable in TCTL if and only
if $\varphi$ has a left-closed right-open STCTL-model. Thus by Lemma 1, the finite
satisfiability of TCTL-formulae is decidable in exponential time.

We end this section by proving that the set $L_{Fin}$ is a proper subset of $L_{STCTL}$,
where $L_{STCTL}$ is the language of the STCTL-satisfiable formulae. By Theorem 4
we have that $L_{Fin} \subseteq L_{STCTL}$. The strict containment can be proved by showing
that there exists a formula $\varphi$ such that $\varphi$ is STCTL-satisfiable but is not finitely
satisfiable.

Example 1. Consider the formula $\varphi = \forall\Box_{\leq c}\, p \wedge \forall\Box_{> c}\, \neg p$. Let $(t, \tau)$ be an
STCTL-structure such that (1) for any $i \leq deg(\varepsilon)$, $t_{sing}(\varepsilon) = t_{open}(i) = t_{sing}(i) = p$ and
$\tau(i) = c$, and (2) $t_{sing}(v) = t_{open}(v) = \neg p$ for any other $v \in dom(t)$. Clearly
$(t, \tau)$ is an STCTL-model of $\varphi$, and thus we have that $\varphi \in L_{STCTL}$. Moreover
$\varphi \notin L_{Fin}$, since truth assignments of a dense path in a timed graph vary on
left-closed right-open intervals.

Thus we have the following lemma. 

Lemma 4. $L_{Fin}$ is strictly contained in $L_{STCTL}$.

5 Conclusions 

In this paper we have proved the decidability of the finite satisfiability of the
TCTL-formulae that do not contain equality in the timing constraints. The
result is obtained by reducing this problem to the emptiness problem for timed
tree automata. The presented construction uses as an intermediate step the
decidability of formulae on left-closed right-open STCTL-structures. According to the
previously known results there were two possible causes of the undecidability
of TCTL finite satisfiability: the denseness of the underlying structure and
equality in the timing constraints. Our results prove that the only source of
undecidability when the structures are defined by timed graphs is the presence of
equality in the timing constraints. We have also compared TCTL to STCTL,
via the language of the formulae which are satisfiable in each of them. The
interesting result we obtained is that the satisfiability problem in TCTL is decidable
on a set of structures more general than those obtained from timed graphs. As
a consequence there exists a more general formulation of dense trees with dense
branching time that matches the language of formulae which are satisfiable in
STCTL. Finally, we prove our results by relating them to the theory of timed tree
automata, thus strengthening the already well-founded connections between the field
of logics and the field of finite automata.
Abstract. Equivalence between designs is a fundamental notion in verification.
The linear and branching approaches to verification induce different notions of
equivalence. When the designs are modeled by fair state-transition systems,
equivalence in the linear paradigm corresponds to fair trace equivalence, and in the
branching paradigm corresponds to fair bisimulation.

In this work we study the expressive power of various types of fairness conditions.
For the linear paradigm, it is known that the Büchi condition is sufficiently
strong (that is, a fair system that uses Rabin or Streett fairness can be translated
to an equivalent Büchi system). We show that in the branching paradigm the
expressiveness hierarchy depends on the types of fair bisimulation one chooses
to use. We consider three types of fair bisimulation studied in the literature:
∃-bisimulation, game-bisimulation, and ∀-bisimulation. We show that while
game-bisimulation and ∀-bisimulation have the same expressiveness hierarchy as tree
automata, ∃-bisimulation induces a different hierarchy. This hierarchy lies
between the hierarchies of word and tree automata, and it collapses at Rabin
conditions of index one, and Streett conditions of index two.



1 Introduction 

In formal verification, we check that a system is correct with respect to a desired behav- 
ior by checking that a mathematical model of the system satisfies a formal specification 
of the behavior. In a concurrent setting, the system under consideration is a composition 
of many components, giving rise to state spaces of exceedingly large size. One of the 
ways to cope with this state-explosion problem is abstraction [BCG88, CFJ93, BG00].
By abstracting away parts of the system that are irrelevant for the specification being 
checked, we hope to end up with manageable state-spaces. Technically, abstraction may 
cause different states s and s' of the system to become equivalent. The abstract system 
then has as its state space the equivalence classes of the equivalence relation between 
the states. In particular, s and s' are merged into the same state. 

* Supported in part by NSF grants CCR-9700061 and CCR-9988322, and by a grant from the 
Intel Corporation. 
We distinguish between two types of equivalence relations between states. In the
linear approach, we require s and s' to agree on linear behaviors (i.e., properties
satisfied by all the computations that start in s and s'). In the branching approach, we require
s and s' to agree on branching behaviors (i.e., properties satisfied by the computation
trees whose roots are s and s'). When we model systems by state-transition systems, 
two states are equivalent in the linear approach iff they are trace equivalent, and they 
are equivalent in the branching approach iff they are bisimilar [Mil71]. The branching 
approach is stronger, in the sense that bisimulation implies trace equivalence but not 
vice versa [Mil71, Pnu85]. 

Of independent interest are the one-way versions of trace equivalence and bisim- 
ulation, namely trace containment and simulation. There, we want to make sure that 
s does not have more behaviors than s'. This corresponds to the basic notion of ver- 
ification, where an implementation cannot have more behaviors than its specification 
[AL91]. In the hierarchical refinement top-down methodology for design development, 
we start with a highly abstract specification, and we construct a sequence of “behavior 
descriptions”. Each description refers to its predecessor as a specification, and the last 
description is sufficiently concrete to constitute the implementation (cf. [LT87, Kur94]). 

The theory behind trace equivalence and bisimulation is well known. We know 
that two states are trace equivalent iff they agree on all LTL specifications, and the 
problem of deciding whether two states are trace equivalent is PSPACE-complete 
[MS72, KV98b]. In the branching approach, two states are bisimilar iff they agree on 
all CTL* formulas, which turned out to be equivalent to agreement on all CTL and 
μ-calculus formulas [BCG88, JW96]. The problem of deciding whether two states are
bisimilar is PTIME-complete [Mil80, BGS92], and a witnessing relation for bisimula- 
tion can be computed using a symbolic fixpoint procedure [McM93, HHK95]. Similar 
results hold for trace containment and simulation. The computational advantage of sim- 
ulation makes it a useful precondition to trace containment [CPS93]. 

State-transition systems describe only the safe behaviors of systems. In order to 
model live behaviors, we have to augment systems with fairness conditions, which 
partition the infinite computations of a system into fair and unfair computations 
[MP92, Fra86]. It is not hard to extend the linear approach to account for fairness: s 
and s' are equivalent if every sequence of observations that is generated along a fair 
computation that starts in s can also be generated along a fair computation that starts in 
s' , and vice versa. Robustness with respect to LTL, and PSPACE-completeness extend 
to the fair case. It is less obvious how to generalize the branching approach to account 
for fairness. Several proposals for fair bisimulation can be found in the literature. We 
consider here three: ∃-bisimulation [GL94], game-bisimulation [HKR97, HR00], and
∀-bisimulation [LT87]. In a bisimulation relation between S and S' with no fairness,
two related states s and s' agree on their observable variables, every successor of s is 
related to some successor of s' , and every successor of s' is related to some succes- 
sor of s. In all the definitions of fair bisimulation, we require related states to agree 
on their observable variables. In ∃-bisimulation, we also require every fair computation
starting at s to have a related fair computation starting at s', and vice versa. In
game-bisimulation, the related fair computations should be generated by strategies that
depend on the states visited so far, and in ∀-bisimulation, the relation is a bisimulation
in which related computations agree on their fairness (we review the formal definitions
in Section 2).

The different definitions induce different relations: ∀-bisimulation implies
game-bisimulation, which implies ∃-bisimulation, but the other direction does not hold
[HKR97]. The difference in the distinguishing power of the definitions is also reflected
in their logical characterization: while ∃-bisimulation corresponds to fair-CTL* (that is,
two systems are ∃-bisimilar iff they agree on all fair-CTL* formulas, where path
quantifiers range over fair computations only [CES86]), game-bisimulation corresponds to
fair-alternation-free μ-calculus¹. Thus, unlike the non-fair case, where almost all modal
logics correspond to bisimulation, here different relations correspond to different
logics [ASB+94]². Finally, the different definitions induce different computational costs.
The exact complexity depends on the fairness condition being used. For the case of
the Büchi fairness condition, for example, the problem of checking whether two
systems are bisimilar is PSPACE-complete for ∃-bisimulation [KV98b], NP-complete for
∀-bisimulation [Hoj96], and PTIME-complete for game-bisimulation [HKR97, HR00].

There are various types of fairness conditions with which we can augment labeled 
state-transition systems [MP92]. Our work here relates fair transition systems and au- 
tomata on infinite objects, and we use the types and names of fairness conditions that 
are common in the latter framework [Tho90]. The simplest condition is Büchi (also
known as unconditional or impartial fairness), which specifies a set of states that should
be visited infinitely often along fair computations. In its dual condition, co-Büchi, the
specified set should be visited only finitely often. More involved are Streett (also known 
as strong fairness or compassion), Rabin (Streett’s dual), and parity conditions, which 
can restrict both the set of states visited infinitely often and the set of states visited 
finitely often. Rabin and parity conditions were introduced for automata and are less 
frequent in the context of state-transition systems. Rabin conditions were introduced by 
Rabin and were used to prove that the logic S2S is decidable [Rab69]. Parity conditions 
can be easily translated to both Rabin and Streett conditions. They have gained their 
popularity as they are suitable for modeling behaviors that are given by means of fixed- 
points [EJ91]. As we formally define in Section 2, Rabin, Streett, and parity conditions 
are characterized by their index, which is the number of pairs (in the case of Rabin and 
Streett) or sets (in the case of parity) they contain. When we talk about a type of a sys- 
tem, we refer to its fairness condition and, in the case of Rabin, Streett, and parity, also 
to its index. For example, a Rabin[1] system is a system whose fairness condition is a
Rabin condition with a single pair. 

The relations between the various types of fairness conditions are well known in the 
linear paradigm. There, we can regard fair transition systems as a notational variant of 
automata on infinite words, and adopt known results about translations among the vari- 
ous types and about the complexity of the trace-equivalence and the trace-containment 
problems [Tho90]. In particular, it is known that the Büchi fairness condition is
sufficiently strong, in the sense that every system can be translated to an equivalent Büchi
system, where equivalence here means that the systems are trace equivalent. 

¹ A semantics of fair-alternation-free μ-calculus is given in [HR00].

² As shown in [ASB+94], the logic CTL induces yet another definition, strictly weaker than
∃-bisimulation. Also, no logical characterization is known for ∀-bisimulation.
In the branching paradigm, tight complexity bounds are known for the
fair-bisimulation problem with respect to the three definitions of fair bisimulation and the
various types of fairness conditions [Hoj96, HKR97, KV98b], but nothing is known
about their expressive power, and about the possibilities of translations among them.
For example, it is not known whether every system can be translated to an equivalent
Büchi system, where now equivalence means fair bisimulation. In particular, it is not
clear whether one can directly apply results from the theory of automata on infinite
trees in order to study fair-bisimulation, and whether the different definitions of fair
bisimulation induce different expressiveness hierarchies.

In this paper, we study the expressive power of the various types of fairness conditions
in the context of fair bisimulation. For each of the three definitions of fair bisimulation,
we consider the following question: given types γ and γ' of fairness conditions,
is it possible to translate every γ-system to a fair-bisimilar γ'-system? If this is indeed
the case, we say that γ' is at least as strong as γ. Then, γ is stronger than γ' if γ is
at least as strong as γ', but γ' is not at least as strong as γ. When γ is stronger than
γ', we also say that γ' is weaker than γ. We show that the expressiveness hierarchy for
game-bisimulation and ∀-bisimulation is strict, and it coincides with the expressiveness
hierarchy of tree automata. Thus, Büchi and co-Büchi systems are incomparable and are
the weakest, and for all i ≥ 1, Rabin[i + 1], Streett[i + 1], and parity[i + 1] are stronger
than Rabin[i], Streett[i], and parity[i], respectively [Rab70, DJW97, Niw97, NW98].
In contrast, the expressiveness hierarchy for ∃-bisimulation is different, and it is not
strict. We show that Büchi and co-Büchi systems are incomparable, and they are both
weaker than Streett[1] systems. Streett[1] systems are in turn weaker than Streett[2]
and Rabin[1] systems, which are both at least as strong as Rabin[i] and Streett[i], for all
i ≥ 1.

Our results imply that the different definitions of fair bisimulation induce different
expressiveness relations between the various types of fairness conditions. These
relations are different from those known for the linear paradigm, and, unlike the case
there, they do not necessarily coincide with the relations that exist in the context of
automata on infinite trees. A decision of which fairness condition and which type of
fair-bisimulation relation to use in a modeling and verification process should take
into account all the characteristics of these types, and it cannot be assumed that what
is well known for one type is true for another.

Due to space limitations, most of the proofs are omitted. A full version can be found
on the homepages of the authors.



2 Definitions 

A fair state-transition system (system, for short) $S = \langle \Sigma, W, R, W_0, L, \alpha \rangle$ consists of
an alphabet $\Sigma$, a finite set $W$ of states, a total transition relation $R \subseteq W \times W$ (i.e.,
for every $w \in W$ there exists $w'$ such that $R(w, w')$), a set $W_0$ of initial states, a
labeling function $L : W \to \Sigma$, and a fairness condition $\alpha$. We will define several types
of fairness conditions shortly. A computation of $S$ is a sequence $\pi = w_0, w_1, w_2, \ldots$
of states such that for every $i \geq 0$, we have $R(w_i, w_{i+1})$. Each computation $\pi =
w_0, w_1, w_2, \ldots$ induces the word $L(\pi) = L(w_0) \cdot L(w_1) \cdot L(w_2) \cdots \in \Sigma^\omega$. In order
to determine whether a computation is fair, we refer to the set $inf(\pi)$ of states that
$\pi$ visits infinitely often. Formally, $inf(\pi) = \{w \in W$ : for infinitely many $i \geq
0$, we have $w_i = w\}$. The way we refer to $inf(\pi)$ depends on the fairness condition of
$S$. Several types of fairness conditions are studied in the literature:

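- Büchi (unconditional or impartial), where $\alpha \subseteq W$, and $\pi$ is fair iff $inf(\pi) \cap \alpha \neq \emptyset$.

- co-Büchi, where $\alpha \subseteq W$, and $\pi$ is fair iff $inf(\pi) \cap \alpha = \emptyset$.

- Parity, where $\alpha$ is a partition of $W$, and $\pi$ is fair in $\alpha = \{F_1, F_2, \ldots, F_k\}$ if the
minimal index $i$ for which $inf(\pi) \cap F_i \neq \emptyset$ exists and is even.

- Rabin, where $\alpha \subseteq 2^W \times 2^W$, and $\pi$ is fair in $\alpha = \{\langle G_1, B_1 \rangle, \ldots, \langle G_k, B_k \rangle\}$ if
there is a $1 \leq i \leq k$ such that $inf(\pi) \cap G_i \neq \emptyset$ and $inf(\pi) \cap B_i = \emptyset$.

- Streett (compassion or strong fairness), where $\alpha \subseteq 2^W \times 2^W$, and $\pi$ is fair in
$\alpha = \{\langle G_1, B_1 \rangle, \ldots, \langle G_k, B_k \rangle\}$ if for all $1 \leq i \leq k$, we have that $inf(\pi) \cap G_i \neq \emptyset$
implies $inf(\pi) \cap B_i \neq \emptyset$.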
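
The five conditions above can be read as predicates on the set $inf(\pi)$. The following minimal Python sketch spells them out. The function names and the lasso representation of a computation (a finite prefix followed by a cycle repeated forever, for which $inf(\pi)$ is exactly the set of states on the cycle) are assumptions made for this illustration, not notation from the paper.

```python
def inf_of_lasso(prefix, cycle):
    """inf(pi) for the computation prefix . cycle^omega: the states on the cycle."""
    return frozenset(cycle)

def buchi(inf, alpha):
    """Fair iff some state of alpha is visited infinitely often."""
    return bool(inf & alpha)

def co_buchi(inf, alpha):
    """Fair iff no state of alpha is visited infinitely often."""
    return not (inf & alpha)

def parity(inf, alpha):
    """alpha is the list [F_1, ..., F_k]; fair iff the least i with
    inf meeting F_i exists and is even (indices start at 1)."""
    for i, part in enumerate(alpha, start=1):
        if inf & part:
            return i % 2 == 0
    return False

def rabin(inf, alpha):
    """alpha is a list of pairs (G_i, B_i); fair iff for some pair
    inf meets G_i and avoids B_i."""
    return any(bool(inf & g) and not (inf & b) for g, b in alpha)

def streett(inf, alpha):
    """alpha is a list of pairs (G_i, B_i); fair iff for every pair
    inf meeting G_i implies inf meeting B_i."""
    return all(not (inf & g) or bool(inf & b) for g, b in alpha)
```

Read this way, the duality between Rabin and Streett is immediate: for the same list of pairs, `streett(inf, alpha)` holds exactly when `rabin(inf, alpha)` does not.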

The number $k$ of sets in a parity fairness condition or of pairs in a Rabin or Streett
fairness condition is the index of $\alpha$. When we talk about the type of a system, we refer
to its fairness condition and, in the case of Rabin, Streett, and parity, also to its index. For
example, a Rabin[1] system is a system whose fairness condition is a Rabin condition
with a single pair. For a state $w$, a $w$-computation is a computation $w_0, w_1, w_2, \ldots$ with
$w_0 = w$. We use $T(S^w)$ to denote the set of all traces $\sigma_0 \cdot \sigma_1 \cdots \in \Sigma^\omega$ for which there
exists a fair $w$-computation $w_0, w_1, \ldots$ in $S$ with $L(w_i) = \sigma_i$ for all $i \geq 0$. The trace
set $T(S)$ of $S$ is then defined as $\bigcup_{w \in W_0} T(S^w)$.
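
As a small illustration of how the condition types relate, a parity condition in the sense defined above can be rewritten as a Rabin condition over the same state set. The sketch below uses one standard construction, which is not necessarily the translation the authors have in mind: every even index $i$ contributes the pair $\langle F_i, F_1 \cup \cdots \cup F_{i-1} \rangle$. All function names are hypothetical.

```python
from itertools import combinations

def parity_fair(inf, parts):
    """Fair iff the least i (from 1) with inf meeting parts[i-1] is even."""
    for i, part in enumerate(parts, start=1):
        if inf & part:
            return i % 2 == 0
    return False

def rabin_fair(inf, pairs):
    """Fair iff some pair (G, B) has inf meeting G and avoiding B."""
    return any(bool(inf & g) and not (inf & b) for g, b in pairs)

def parity_to_rabin(parts):
    """Turn the parity condition [F_1, ..., F_k] into Rabin pairs:
    each even i yields (F_i, F_1 u ... u F_{i-1})."""
    pairs = []
    lower = frozenset()            # union of the sets with smaller index
    for i, part in enumerate(parts, start=1):
        if i % 2 == 0:
            pairs.append((frozenset(part), lower))
        lower = lower | frozenset(part)
    return pairs

def nonempty_subsets(states):
    """All nonempty subsets of states, as candidate values of inf(pi)."""
    for size in range(1, len(states) + 1):
        for combo in combinations(states, size):
            yield frozenset(combo)
```

A pair $\langle F_i, F_1 \cup \cdots \cup F_{i-1} \rangle$ is satisfied exactly when $i$ is the least index met by $inf(\pi)$, so the two conditions accept the same computations, and a parity condition of index $k$ yields $\lfloor k/2 \rfloor$ Rabin pairs.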

We now formalize what it means for two systems (or two states of the same
system) to be equivalent. We give the definitions with respect to two systems $S =
\langle \Sigma, W, R, W_0, L, \alpha \rangle$ and $S' = \langle \Sigma, W', R', W'_0, L', \alpha' \rangle$, with the same alphabet.³ We
consider two equivalence criteria: trace equivalence and bisimulation. While the first
criterion is clear ($T(S) = T(S')$), several proposals are suggested in the literature for
bisimulation in the case of systems with fairness. Before we define them, let us first
recall the definition of bisimulation for the non-fair case.

Bisimulation [Mil71] A relation $H \subseteq W \times W'$ is a bisimulation relation between $S$
and $S'$ iff the following conditions hold for all $\langle w, w' \rangle \in H$.

1. $L(w) = L'(w')$.

2. For all $s \in W$ with $R(w, s)$, there is $s' \in W'$ such that $R'(w', s')$ and $H(s, s')$.

3. For all $s' \in W'$ with $R'(w', s')$, there is $s \in W$ such that $R(w, s)$ and $H(s, s')$.
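
The three conditions suggest a simple way to compute the largest bisimulation relation between two finite systems: start from all label-compatible pairs and repeatedly discard pairs that violate condition 2 or 3. The following explicit-state Python sketch does this naively. It is only an illustration and not the symbolic fixpoint procedure cited in the introduction, and the encoding of a system as a state list, a successor dictionary, and a label dictionary is an assumption.

```python
def greatest_bisimulation(states1, succ1, label1, states2, succ2, label2):
    """Return the largest bisimulation between two finite systems.

    succ1 and succ2 map each state to the set of its successors (assumed total);
    label1 and label2 map each state to its label.
    """
    # Condition 1: only pairs with equal labels can be related.
    h = {(w, v) for w in states1 for v in states2 if label1[w] == label2[v]}
    changed = True
    while changed:
        changed = False
        for (w, v) in list(h):
            # Condition 2: every successor of w is matched by a successor of v.
            forth = all(any((s, t) in h for t in succ2[v]) for s in succ1[w])
            # Condition 3: every successor of v is matched by a successor of w.
            back = all(any((s, t) in h for s in succ1[w]) for t in succ2[v])
            if not (forth and back):
                h.discard((w, v))
                changed = True
    return h
```

Two states $w$ and $w'$ are then bisimilar exactly when the pair $\langle w, w' \rangle$ survives in the returned relation.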

We now describe three extensions of bisimulation relations to the fair case. In all
definitions, we extend a relation $H \subseteq W \times W'$, over the states of $S$ and $S'$, to a relation
over infinite computations of $S$ and $S'$: for two computations $\pi = w_0, w_1, \ldots$ in $S$, and
$\pi' = w'_0, w'_1, \ldots$ in $S'$, we have $H(\pi, \pi')$ iff $H(w_i, w'_i)$, for all $i \geq 0$.

∃-bisimulation [GL94] A relation $H \subseteq W \times W'$ is an ∃-bisimulation relation between
$S$ and $S'$ iff the following conditions hold for all $\langle w, w' \rangle \in H$.

³ In practice, $S$ and $S'$ are given as systems over alphabets $2^{AP}$ and $2^{AP'}$, where $AP$ and $AP'$
are the sets of atomic propositions used in $S$ and $S'$, and possibly $AP \neq AP'$. When we
compare $S$ with $S'$, we refer only to the common atomic propositions, thus $\Sigma = 2^{AP \cap AP'}$.
1. $L(w) = L'(w')$.

2. Each fair $w$-computation $\pi$ in $S$ has a fair $w'$-computation $\pi'$ in $S'$ with $H(\pi, \pi')$.

3. Each fair $w'$-computation $\pi'$ in $S'$ has a fair $w$-computation $\pi$ in $S$ with $H(\pi, \pi')$.

Game bisimulation [HKR97, HR00] Game bisimulation is defined by means of a
game between a protagonist and an adversary. The positions of the game are pairs
in $W \times W'$. A strategy $\tau$ for the protagonist is a partial function from $(W \times W')^* \times
(W \cup W')$ to $(W' \cup W)$, such that for all $\rho \in (W \times W')^*$, $w \in W$, and $w' \in W'$, we
have that $\tau(\rho \cdot w) \in W'$ and $\tau(\rho \cdot w') \in W$. Thus, if the game so far has produced the
sequence $\rho$ of positions, and the adversary moves to $w$ in $S$, then the strategy $\tau$ instructs
the protagonist to move to $w' = \tau(\rho \cdot w)$, resulting in the new position $\langle w, w' \rangle$. If the
adversary chooses to move to $w'$ in $S'$, then $\tau$ instructs the protagonist to move to
$w = \tau(\rho \cdot w')$, resulting in the new position $\langle w, w' \rangle$. A sequence $\bar{w} = \langle w_0, w'_0 \rangle \cdot
\langle w_1, w'_1 \rangle \cdots \in (W \times W')^\omega$ is an outcome of the strategy $\tau$ if for all $i \geq 0$, either
$w'_{i+1} = \tau(\langle w_0, w'_0 \rangle \cdots \langle w_i, w'_i \rangle \cdot w_{i+1})$, or $w_{i+1} = \tau(\langle w_0, w'_0 \rangle \cdots \langle w_i, w'_i \rangle \cdot w'_{i+1})$.

A binary relation H C W x W' is a game bisimulation relation between S and S' 
if there exists a strategy r such that the following conditions hold for all (zu, w') in H. 

1. L{w) = L{w'). 

2. Every outcome w = {wq, w'q) ■ {wi,w'i) • • • of r with wq = w and w'q = w' has 
the following two properties: (1) for all z > 0, we have {wi, w'^ G H, and (2) the 
projection zuo • zui • • • of uJ to IE is a fair zuo -computation of S iff the projection 
zug • zu j • • • of u? to W' is a fair zug -computation of S' . 

$\forall$-bisimulation [LT87, DHW91] A binary relation $H \subseteq W \times W'$ is a $\forall$-bisimulation
relation between $S$ and $S'$ if the following conditions hold:

1. $H$ is a bisimulation relation between $S$ and $S'$.

2. If $H(w, w')$, then for every fair $w$-computation $\pi$ of $S$ and for every
$w'$-computation $\pi'$ of $S'$, if $H(\pi, \pi')$, then $\pi'$ is fair.

3. If $H(w, w')$, then for every fair $w'$-computation $\pi'$ of $S'$ and for every
$w$-computation $\pi$ of $S$, if $H(\pi, \pi')$, then $\pi$ is fair.

It is not hard to see that if $H$ is a $\forall$-bisimulation relation, then $H$ is also a
game-bisimulation relation. Also, if $H$ is a game-bisimulation relation, then $H$ is also an
$\exists$-bisimulation relation. As demonstrated in [HKR97], the other direction is not true.

For all types $\beta$ of bisimulation relations (that is, $\beta \in \{\exists, \mathit{game}, \forall\}$), a $\beta$-bisimulation
relation $H$ is a $\beta$-bisimulation between $S$ and $S'$ if for every $w \in W_0$ there exists
$w' \in W'_0$ such that $H(w, w')$, and for every $w' \in W'_0$ there exists $w \in W_0$ such that
$H(w, w')$. If there is a $\beta$-bisimulation between $S$ and $S'$, we say that $S$ and $S'$ are
$\beta$-bisimilar. Intuitively, bisimulation implies that $S$ and $S'$ have the same behaviors.
Formally, two bisimilar systems with no fairness agree on the satisfaction of all branching
properties that can be specified in a conventional temporal logic (in particular, CTL*
and $\mu$-calculus) [BCG88, JW96]. When we add fairness, the logical characterization
becomes less robust: $\exists$-simulation corresponds to fair-CTL*, and game-simulation
corresponds to fair-alternation-free $\mu$-calculus [ASB+94, GL94, HKR97, HR00].

For $\exists$-bisimulation and $\forall$-bisimulation, a relation $H \subseteq W \times W'$ is a $\beta$-simulation
relation from $S$ to $S'$ if conditions 1 and 2 for $H$ being a $\beta$-bisimulation relation hold.
For game-bisimulation, a relation $H$ is a game-simulation relation from $S$ to $S'$ if we
restrict the moves of the adversary to choose only states from $S$. A $\beta$-simulation relation
$H$ is a $\beta$-simulation from $S$ to $S'$ iff for every $w \in W_0$ there exists $w' \in W'_0$ such that
$H(w, w')$. If there is a $\beta$-simulation from $S$ to $S'$, we say that $S'$ $\beta$-simulates $S$, and
we write $S \leq_\beta S'$. Intuitively, while bisimulation implies that $S$ and $S'$ have the same
behaviors, simulation implies that $S$ has fewer behaviors than $S'$.

It is easy to see that bisimulation implies trace equivalence. The other direction, 
however, is not true [Mil71]. Hence, our equivalence criteria induce different equiva- 
lence relations. When attention is restricted to trace equivalence, it is known how to 
translate all fair systems to an equivalent Büchi system. In this paper we consider the
problem of translations among systems that preserve bisimilarity. 



3 Expressiveness with $\exists$-Bisimulation

In the linear case, it follows from automata theory that co-Büchi systems are weaker
than Büchi systems, which are as strong as parity, Rabin, and Streett systems. In the
branching case, nondeterministic Büchi and co-Büchi tree automata are both weaker
than Rabin tree automata, and, for all $i \geq 1$, parity[$i$], Rabin[$i$], and Streett[$i$] are weaker
than parity[$i+1$], Rabin[$i+1$], and Streett[$i+1$], respectively [Rab70, DJW97, Niw97,
NW98]. In this section we show that the expressiveness hierarchy in the context of
$\exists$-bisimulation is located between the hierarchies of word and tree automata.^

We first show that Büchi and co-Büchi systems are weak. The arguments we use are
similar to those used by Rabin in the context of tree automata [Rab70]. Our proofs use
the notion of maximal models [GL94, KV98c]. A system $M_\psi$ is a maximal model for
an $\forall$CTL* formula $\psi$ if $M_\psi \models \psi$ and for every module $M$ we have that $M \leq_\exists M_\psi$
iff $M \models \psi$. It can be shown that there is no Büchi system that is $\exists$-bisimilar to the
maximal model of the formula $\forall \Diamond \Box p$ and that there is no co-Büchi system that is
$\exists$-bisimilar to the maximal model of the formula $\forall \Box \Diamond p$. Hence, we have:

Theorem 1. Büchi is not at least as $\exists$-strong as co-Büchi and co-Büchi is not at least
as $\exists$-strong as Büchi.

Note that Theorem 1 implies that the Büchi condition is too weak for defining maximal
models for $\forall$CTL* formulas. On the other hand, the Büchi condition is sufficiently
strong for defining maximal models for $\forall$CTL formulas [GL94, KV98a]. Since parity,
Rabin, and Streett are at least as $\exists$-strong as Büchi and co-Büchi, it follows from
Theorem 1 that parity, Rabin, and Streett are all $\exists$-stronger than Büchi and co-Büchi.

So far things seem to be very similar to tree automata, where Büchi and co-Büchi
conditions are incomparable [Rab70]. In particular, the ability of the Büchi condition
to define maximal models for $\forall$CTL and its inability to define maximal models for
$\forall$CTL* seems related to the ability to translate CTL formulas to Büchi tree automata
and the inability to translate CTL* formulas to Büchi tree automata (as follows from
Rabin's result [Rab70]). In tree automata, the hierarchy of expressive power stays strict
also when we proceed to parity (or Rabin or Streett) fairness conditions with increasing
indices [DJW97, Niw97, NW98]. We now show that, surprisingly, in the context of
$\exists$-bisimulation, Rabin conditions of index one are at least as strong as parity, Rabin, and
Streett conditions with an unbounded index. In particular, it follows that maximal
models for $\forall$CTL* can be defined with Rabin[1] fairness. The idea behind the construction
is similar to the conversion of Rabin and Streett automata on infinite words to Büchi
automata on infinite words.

^ Here and in the sequel, we use terms like "$\gamma$ is $\exists$-stronger than $\gamma'$" to indicate that $\gamma$ is stronger
than $\gamma'$ in the context of $\exists$-bisimulation.

Lemma 1. Every Rabin system with $n$ states and index $k$ has an $\exists$-bisimilar Rabin
system with $O(nk)$ states and index 1.

Proof: Let $S = \langle \Sigma, W, W_0, R, L, \alpha \rangle$ be a Rabin system with $\alpha =
\{\langle G_1, B_1 \rangle, \ldots, \langle G_k, B_k \rangle\}$. We define $S' = \langle \Sigma, W', W'_0, R', L', \alpha' \rangle$ as follows.

- For every $1 \leq i \leq k$, let $W_i = (W \setminus B_i) \times \{i\}$. Then, $W' = (W \times \{0\}) \cup
\bigcup_{1 \leq i \leq k} W_i$, and $W'_0 = W_0 \times \{0\}$.

- $R' = \bigcup_{0 \leq i \leq k} \{ \langle \langle w, 0 \rangle, \langle w', i \rangle \rangle, \langle \langle w, i \rangle, \langle w', 0 \rangle \rangle, \langle \langle w, i \rangle, \langle w', i \rangle \rangle : \langle w, w' \rangle \in R \} \cap
(W' \times W')$. Note that $R'$ is total.

- For all $w \in W$ and $0 \leq i \leq k$, we have $L'(\langle w, i \rangle) = L(w)$.

- $\alpha' = \{ \langle \bigcup_{1 \leq i \leq k} G_i \times \{i\}, W \times \{0\} \rangle \}$.

Thus, $S'$ consists of $k+1$ copies of $S$. One copy ("the idle copy") contains all the
states in $W$, marked with 0. Then, $k$ copies are partial: every such copy is associated
with a pair $\langle G_i, B_i \rangle$, its states are marked with $i$, and it contains all the states in $W \setminus B_i$.
A computation of $S'$ can return to the idle copy from all copies, where it can choose
between staying in the idle copy or moving to one of the other $k$ copies. The acceptance
condition forces a fair computation to visit the idle copy only finitely often, forcing the
computation to eventually get trapped in a copy associated with some pair $\langle G_i, B_i \rangle$.
There, the computation cannot visit states from $B_i$ (indeed, $W_i$ does not contain such
states), and it has to visit states from $G_i$ infinitely often. It is not hard to see that the
relation $H = \{ \langle w, \langle w, i \rangle \rangle : w \in W \text{ and } 0 \leq i \leq k \}$ is an $\exists$-bisimulation between $S$
and $S'$, thus $S$ and $S'$ are $\exists$-bisimilar. □
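
The construction in the proof can be sketched in Python as follows. The data representation (sets of states, a set of transition pairs, a list of (G, B) pairs) and the function name are assumptions made for illustration. The sketch also intersects each $G_i \times \{i\}$ with $W'$, so that the resulting pair only mentions states that exist in $S'$.

```python
def rabin_to_index_one(W, W0, R, L, alpha):
    """Build the index-1 Rabin system S' of Lemma 1 from a Rabin system S.

    W: set of states, W0: initial states, R: set of (w, v) transitions,
    L: dict state -> label, alpha: list of (G_i, B_i) pairs of state sets.
    """
    k = len(alpha)
    # Idle copy (index 0) plus one partial copy per pair, without the B_i states.
    W_new = {(w, 0) for w in W}
    for i, (G, B) in enumerate(alpha, start=1):
        W_new |= {(w, i) for w in W if w not in B}
    W0_new = {(w, 0) for w in W0}

    # Moves: idle -> copy i, copy i -> idle, and staying inside copy i.
    R_new = set()
    for (w, v) in R:
        for i in range(k + 1):
            for edge in (((w, 0), (v, i)), ((w, i), (v, 0)), ((w, i), (v, i))):
                if edge[0] in W_new and edge[1] in W_new:
                    R_new.add(edge)

    L_new = {(w, i): L[w] for (w, i) in W_new}
    # Single Rabin pair: visit some G_i in copy i infinitely often,
    # and the idle copy only finitely often.
    good = {(w, i) for i, (G, B) in enumerate(alpha, start=1)
            for w in G if (w, i) in W_new}
    bad = {(w, 0) for w in W}
    return W_new, W0_new, R_new, L_new, [(good, bad)]


# Hypothetical two-state example with a single pair (G, B) = ({"b"}, {"a"}).
EXAMPLE = rabin_to_index_one(
    W={"a", "b"}, W0={"a"},
    R={("a", "b"), ("b", "a"), ("b", "b")},
    L={"a": "p", "b": "q"},
    alpha=[({"b"}, {"a"})],
)
```

The number of states of the result is at most $n(k+1)$, in line with the $O(nk)$ bound of the lemma.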

In the case of transforming Rabin[$k$] word automata to Rabin[1] (or Büchi) automata,
runs of the automaton on different computations are independent of each other,
so there is no need for the automaton to "change its mind" about the pair in $\alpha$ with
respect to which the computation is fair. Accordingly, there is no need to return to an
idle copy. In the case of tree automata, runs on different computations of the tree depend
on each other, and the run of the automaton along a computation may need to
postpone its choice of a suitable pair in $\alpha$ ad infinitum, which cannot be captured with a
Rabin[1] condition. The crucial observation about $\exists$-bisimulation is that here, if $\pi_1$ and
$\pi_2$ are different fair $w$-computations, then the fair computations $\pi'_1$ and $\pi'_2$ for which
$H(\pi_1, \pi'_1)$ and $H(\pi_2, \pi'_2)$ are independent. Thus, each computation eventually reaches
a state where it can stick to its suitable pair in $\alpha$. Accordingly, a computation needs to
change its mind only finitely often. A visit to the idle copy corresponds to the computation
changing its mind, and the fairness condition guarantees that there are only finitely
many visits to the idle copy.

We now describe a similar transformation for Streett systems. While in Rabin systems
each copy of the original system corresponds to a guess of a pair $\langle G_i, B_i \rangle$ for
which $G_i$ is visited infinitely often and $B_i$ is visited only finitely often, here each copy
would correspond to a subset $I \subseteq \{1, \ldots, k\}$ of pairs, where the copy associated with
$I$ corresponds to a guess that $B_i$ and $G_i$ are visited infinitely often for all $i \in I$, and $G_i$
is visited only finitely often for all $i \notin I$.

Lemma 2. Every Streett system with $n$ states and index $k$ has an $\exists$-bisimilar Rabin
system with $O(n \cdot 2^{O(k)})$ states and index 1.

Note that while the blow up in the construction in Lemma 1 is linear in the index 
of the Rabin system, the blow up in the construction in Lemma 2 is exponential in 
the index of the Streett system. The above blow ups are tight for the linear paradigm 
[SV89].^ Since $\exists$-bisimulation implies trace equivalence, it follows that these blow ups
are tight also for the $\exists$-bisimulation case.

Since the parity condition is a special case of Rabin, Lemma 1 also implies a translation
of parity systems to $\exists$-bisimilar Rabin[1] systems. Also, a Rabin[1] condition
$\{\langle G, B \rangle\}$ can be viewed as a parity condition $\{B, G \setminus B, W \setminus (G \cup B)\}$. Hence,
parity[3] is as $\exists$-strong as Rabin[1].^ A Rabin[1] condition $\{\langle G, B \rangle\}$ is equivalent to the
Streett[2] condition $\{\langle W, G \rangle, \langle B, \emptyset \rangle\}$. So, Streett[2] is also as $\exists$-strong as Rabin[1]. It
turns out that we can combine the arguments for Büchi and co-Büchi in Theorem 1 to
prove that Streett[1] is $\exists$-weaker than Streett[2]. To sum up, we have the following.
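
The equivalences between the three conditions can be checked mechanically on the set of states that a computation visits infinitely often. The Python sketch below does so by brute force over all nonempty such sets. It assumes the usual readings of the conditions: a Rabin pair $\langle G, B \rangle$ asks for $G$ infinitely often and $B$ only finitely often, a Streett pair $\langle G, B \rangle$ asks that $G$ infinitely often implies $B$ infinitely often, and a parity partition is satisfied when the least index of a set visited infinitely often is even (indices starting at 1). The parity convention in particular is an assumption, since it is defined outside this section.

```python
from itertools import combinations


def rabin_fair(inf, pairs):
    """Some pair has G visited infinitely often and B only finitely often."""
    return any(bool(inf & G) and not (inf & B) for G, B in pairs)


def streett_fair(inf, pairs):
    """For every pair, G infinitely often implies B infinitely often."""
    return all((not (inf & G)) or bool(inf & B) for G, B in pairs)


def parity_fair(inf, partition):
    """Least index (from 1) of a set visited infinitely often is even."""
    for index, block in enumerate(partition, start=1):
        if inf & block:
            return index % 2 == 0
    return False


def conditions_agree(W, G, B):
    """Compare Rabin[1], Streett[2] and parity[3] on every nonempty inf-set."""
    rabin = [(G, B)]
    streett = [(W, G), (B, set())]
    parity = [B, G - B, W - (G | B)]
    for size in range(1, len(W) + 1):
        for subset in combinations(sorted(W), size):
            inf = set(subset)
            verdicts = {rabin_fair(inf, rabin),
                        streett_fair(inf, streett),
                        parity_fair(inf, parity)}
            if len(verdicts) != 1:
                return False
    return True
```

For instance, `conditions_agree({0, 1, 2, 3}, {1, 2}, {2, 3})` compares the three conditions on all 15 nonempty subsets of a four-state set.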

Theorem 2. For every fairness type $\gamma$, the types Rabin[1], Streett[2], and parity[3] are
all at least as $\exists$-strong as $\gamma$.

Note that the types described in Theorem 2 are tight, in the sense that, as discussed
above, Büchi, co-Büchi, Streett[1], and parity[2] may be $\exists$-weaker than $\gamma$.

In the full version, we also show that a system with a generalized Büchi condition
or with a justice condition [MP92] can be translated to an $\exists$-bisimilar Büchi system,
implying that generalized Büchi and justice conditions are also too weak.

4 Expressiveness with Game-Bisimulation and $\forall$-Bisimulation

We now study the expressiveness hierarchy for game-bisimulation and $\forall$-bisimulation.
We show that unlike $\exists$-simulation, here the hierarchy coincides with the hierarchy of
tree automata. Thus, Rabin[$i+1$] is stronger than Rabin[$i$], and similarly for Streett and
parity. In order to do so, we define game-bisimulation between tree automata, and define 
transformations preserving game-bisimulation between tree automata and fair systems. 
We show that game-bisimilar tree automata agree on their languages (of trees), which 
enables us to relate the expressiveness hierarchies in the two frameworks. 

^ [SV89] shows that the transition from Streett word automata to Büchi word automata is
exponential in the index of the Streett automaton. Since the transition from Rabin[1] to Büchi is
linear, a lower bound for the transition from Streett to Rabin[1] follows.

^ Recall that a parity fairness condition is a partition of the state set. Hence, a parity[2] condition
can be translated to an equivalent co-Büchi fairness condition and vice versa, implying that
Rabin[1] is $\exists$-stronger than parity[2].

Due to lack of space we only give an outline of the proof. We define a special
type of tree automata, called loose tree automata. Unlike conventional tree automata
[Tho90], the transition function of loose tree automata does not distinguish between the
successors of a node, it does not force states to be visited, and it only restricts the set
of states that each of the successors may visit. When $A$ runs on a labeled tree $\langle T, V \rangle$
and it visits a node $x$ with label $\sigma$ at state $q$, then $\delta(q, \sigma) = S$ (where $S$ is a subset of
the states of $A$) means that $A$ should send to all the successors of $x$ copies in states in
$S$. Loose tree automata can use all types of fairness. A run of a loose tree automaton is
accepting if all the infinite paths of the run tree satisfy the fairness condition.

We can define game-bisimulation for loose tree automata. Given two loose tree au- 
tomata, we define a game whose positions are pairs of states. A strategy for the game is 
similar to the strategy defined for systems, but this time the adversary gets to choose an 
alphabet letter and a successor corresponding to this letter. The protagonist has to follow 
with a successor corresponding to the same letter in the other automaton. A relation is 
a game-bisimulation relation if all the outcomes of such plays starting at related states 
have both projections fair or have both projections unfair. Two loose tree automata are 
game-bisimilar if there exists a game-bisimulation between them that relates the starting 
states of each one of the automata to starting states of the other. 

Recall that game-bisimulation between systems implies trace equivalence. Game- 
bisimulation between loose tree automata implies not only agreement on traces that may 
label paths of accepted trees, but also agreement on the accepted trees! The idea is that 
given an accepting run tree of one automaton, we use the strategy to build an accepting 
run tree of its game-bisimilar counterpart. This property of game-bisimulation between 
loose tree automata enables us to relate the hierarchy of loose tree automata with that 
of game-bisimulation. Formally, we have the following. 

Theorem 3. Let $\gamma$ and $\gamma'$ be two types of fairness conditions. If $\gamma$ is at least as strong
as $\gamma'$ in the context of game-bisimulation or $\forall$-bisimulation, then $\gamma$ is at least as strong
as $\gamma'$ also in the context of loose tree automata.

While loose tree automata are weaker than conventional tree automata [Tho90], the
expressiveness hierarchy of loose tree automata coincides with that of tree automata
(this is because the latter coincides with the hierarchy of deterministic word automata
[Wag79, Kam85], and is proven in [KSV96, DJW97, Niw97, NW98] by means of languages
that can be recognized by loose tree automata). It follows that the expressiveness
hierarchy in the context of game-bisimulation and $\forall$-bisimulation coincides with that of
tree automata.

5 Discussion 

We considered two equivalence criteria — bisimulation and trace equivalence — be- 
tween fair state-transition systems. We studied the expressive power of various fairness 
conditions in the context of fair bisimulation. We showed that while the hierarchy in 
the context of trace equivalence coincides with the one of nondeterministic word automata,
the hierarchy in the context of bisimulation depends on the exact definition of
fair bisimulation, and it does not necessarily coincide with the hierarchy of tree automata.
In particular, we showed that Rabin[1] systems are sufficiently strong to model
all systems up to $\exists$-bisimilarity.

There is an intermediate equivalence criterion: two-way simulation (that is, $S \leq S'$
and $S' \leq S$) is implied by bisimulation, it implies trace equivalence, and it is equal
to neither of the two [Mil71]. Two-way simulation is a useful criterion: $S$ and $S'$ are
two-way similar iff for every system $S''$ we have $S'' \leq S$ iff $S'' \leq S'$ and $S \leq S''$
iff $S' \leq S''$. Hence, in hierarchical refinement, or when defining maximal models for
universal formulas, we can replace $S$ with $S'$. A careful reading through our proofs
shows that all the results described in the paper for bisimulation hold also for two-way
simulation.

Finally, the study of $\exists$-bisimulation in Section 3 has led to a simple definition of
parallel compositions for Rabin and parity systems, required for modular verification
of concurrent systems. In the linear paradigm, the composition $S = S_1 \| S_2$ of $S_1$ and
$S_2$ is defined so that $T(S) = T(S_1) \cap T(S_2)$ (cf. [Kur94]). In the branching paradigm
[GL94], Grumberg and Long defined the parallel compositions of two Streett systems.
As studied in [GL94, KV98a], in order to be used in modular verification, a definition
of composition has to satisfy the following two conditions, for all systems $S$, $S'$, and
$S''$. First, if $S' \leq_\exists S''$, then $S \| S' \leq_\exists S \| S''$. Second, $S \leq_\exists S' \| S''$ iff $S \leq_\exists S'$ and
$S \leq_\exists S''$. In particular, it follows that $S \| S' \leq_\exists S'$, thus every universal formula that is
satisfied by a component of a parallel composition is satisfied also by the composition.
When $S_1$ and $S_2$ are Streett systems, the definition of $S_1 \| S_2$ is straightforward, and
is similar to the product of two Streett word automata. When, however, $S_1$ and $S_2$ are
Rabin systems, the definition of product of word automata cannot be applied, and a 
definition that follows the ideas behind a product of tree automata is very
complicated. In the full paper we show that the fact that $\exists$-bisimulation is located
between word and tree automata enables a simple definition of parallel composition
that obeys the two conditions above.

Abstract. This paper addresses the problems of counting proof trees
(as introduced by Venkateswaran and Tompa) and counting proof circuits,
a related but seemingly more natural question. These problems
lead to a common generalization of straight-line programs which we call
polynomial replacement systems. We contribute a classification of these
systems and we investigate their complexity. Diverse problems falling
in the scope of this study include, for example, counting proof circuits,
and evaluating $\{\cup, +\}$-circuits over the natural numbers. The former is
shown to be #P-complete, the latter to be equivalent to a particular problem
for replacement systems.



1 Introduction 

1.1 Motivation 

When $+$ and $\times$ replace $\vee$ and $\wedge$ in the adjacent
figure, the gate $g_1$ on input $x_1 = x_2 = 1$ evaluates
to 9. Equivalently, the tree-like Boolean circuit $T$
obtained from the circuit drawn has 9 proof trees
[VT89], i.e. 9 different minimal subcircuits witnessing
that $T$ outputs 1 (gates replicated to form $T$ are
independent). This relationship between proof tree
counting and monotone arithmetic circuits was used
by Venkateswaran [Ven92] to characterize nondeterministic
time classes, including #P [Val79], and
by Vinay [Vin91] to characterize the counting version of LOGCFL [Sud78].
The same relationship triggered the investigation of #NC$^1$ by Caussinus et al.
[CMTV98], and that of #AC$^0$ by Allender et al. [AAD97]. See [All98] for recent
results and for motivation to study such “small” arithmetic classes.

A recent goal has been to capture small arithmetic classes by counting objects
other than proof trees, notably paths in graphs. Allender et al. [AAB+99]
succeeded in identifying appropriate graphs for #AC$^0$. Given the growing importance
of counting classes, our motivation for the present work was the desire
to avoid unwinding circuits into trees before counting their “proofs”. Define a
proof circuit to be a minimal subcircuit witnessing that a circuit outputs 1.
More precisely, for a Boolean circuit $C$ and an input $x$, a proof circuit is an
edge-induced connected subcircuit of $C$ which evaluates to 1 on $x$. This subcircuit
must contain the output gate of $C$, as well as exactly one $C$-edge into each
$\vee$-gate and all $C$-edges into each $\wedge$-gate. The reader should convince herself that
the circuit depicted above, which had 9 proof trees on input $x_1 = x_2 = 1$, has
only 7 proof circuits on that input.

What counting classes arise from counting proof circuits instead of trees?
This question had two surprises in store, the first of which is the following
algorithm:

1. replace $\vee$ by $+$ and $\wedge$ by $\times$ in a negation-free Boolean circuit $C$,

2. view $C$ as a straight-line program prescribing in the usual way a formal
polynomial in the input variables $x_1, \ldots, x_n$,

3. compute the polynomial top-down, with an important proviso: at each step,
knock any nontrivial exponent down to 1 in the intermediate sum-of-monomials
representation.

We get the number of proof circuits of $C$ on an input $x$ by evaluating the final
polynomial at $x$. For example, the circuit depicted above had 7 proof circuits on
input $x_1 = x_2 = 1$ because

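\begin{align}
g_1 &\to g_2 g_3 \tag{1} \\
    &\to (x_1 + g_4)\, g_3 \tag{2} \\
    &\to (x_1 + g_4)(g_4 + x_2) \to x_1 g_4 + x_1 x_2 + g_4 + g_4 x_2 \tag{3} \\
    &\to x_1 (x_1 + x_2) + x_1 x_2 + (x_1 + x_2) + (x_1 + x_2) x_2, \tag{4}
\end{align}

where $g_4^2$ became $g_4$ in the middle of step (3).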
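
The three-step procedure can be sketched in Python as follows. The gate definitions used in the example ($g_1 = g_2 \wedge g_3$, $g_2 = x_1 \vee g_4$, $g_3 = g_4 \vee x_2$, $g_4 = x_1 \vee x_2$) are read off the derivation above and are an assumption about the figure, which is not reproduced here. The representation of polynomials and all function names are likewise illustrative. A monomial is a sorted tuple of variable names, repeated once per exponent; with `idempotent=True` every exponent is knocked down to 1 after each step.

```python
from collections import defaultdict


def normalize(monomial, idempotent):
    """Sort a monomial; under the idempotent rule drop repeated variables."""
    return tuple(sorted(set(monomial))) if idempotent else tuple(sorted(monomial))


def gate_polynomial(kind, inputs):
    """An AND gate is the product of its inputs, an OR gate is their sum."""
    if kind == "and":
        return {tuple(sorted(inputs)): 1}
    poly = defaultdict(int)
    for v in inputs:
        poly[(v,)] += 1
    return dict(poly)


def substitute(poly, var, replacement, idempotent):
    """Replace every occurrence of `var` in `poly` by the polynomial `replacement`."""
    result = defaultdict(int)
    for monomial, coeff in poly.items():
        rest = tuple(v for v in monomial if v != var)
        terms = {rest: coeff}
        for _ in range(len(monomial) - len(rest)):  # once per power of var
            expanded = defaultdict(int)
            for m1, c1 in terms.items():
                for m2, c2 in replacement.items():
                    expanded[normalize(m1 + m2, idempotent)] += c1 * c2
            terms = expanded
        for m, c in terms.items():
            result[normalize(m, idempotent)] += c
    return dict(result)


def count(circuit, order, x, idempotent):
    """Expand top-down along `order` (output gate first), then evaluate at x."""
    poly = {(order[0],): 1}
    for gate in order:
        kind, inputs = circuit[gate]
        poly = substitute(poly, gate, gate_polynomial(kind, inputs), idempotent)
    total = 0
    for monomial, coeff in poly.items():
        value = coeff
        for v in monomial:
            value *= x[v]
        total += value
    return total


# Assumed reading of the example circuit (see the lead-in).
CIRCUIT = {
    "g1": ("and", ["g2", "g3"]),
    "g2": ("or", ["x1", "g4"]),
    "g3": ("or", ["g4", "x2"]),
    "g4": ("or", ["x1", "x2"]),
}
ORDER = ["g1", "g2", "g3", "g4"]  # each gate after all gates that use it
```

The order matters: a gate may only be expanded after every gate that uses it, so that repeated occurrences such as $g_4^2$ are merged before $g_4$ itself is replaced. Without the idempotent rule, the same expansion gives the proof tree count.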

One’s intuition might be that such a simple strategy could be massaged into 
an arithmetic circuit or at least into a sublinear parallel algorithm [VSBR83]. 
Our second surprise was that counting proof circuits, even for depth-4
semi-unbounded circuits, is #P-complete. Hence, not only is our strategy hard to
parallelize, it likely genuinely requires exponential time!

Our three-step algorithm above thus counts proof trees in the absence of
the idempotent rules $y^2 \to y$, and it counts proof circuits in their presence.

Moreover, whereas an arithmetic circuit computing the number of proof trees of 
a circuit is readily available, producing such a circuit to compute proof circuits 
seems intractable. What is special about the idempotent rules? What would the 
effect of multivariate rules be? Which nontrivial rules would nonetheless permit 
expressing the counting in the form of an arithmetic circuit? What is a general 
framework in which complexity questions such as these can be investigated? 



1.2 Results

We view our results as forming three main contributions. 

Our first contribution is to define and classify polynomial replacement systems
(prs for short). Prs provide the answer to the framework question. A prs
in its full generality is a start polynomial $q \in \mathbb{N}[x_1, \ldots, x_m]$ together with a
set of replacement rules. A replacement rule is a pair of polynomials $(p_1, p_2)$.
Informally, $(p_1, p_2)$ is applicable to a polynomial $q$ if $q$ can be written in a form
in which $p_1$ appears. Applying $(p_1, p_2)$ to $q$ then consists of replacing $p_1$ by $p_2$
(see Sect. 3 for formal definitions).

A prs generally defines a set of polynomials, since the choice and sequencing 
of the rules, and the way in which the rules are applied, may generate different 
polynomials. Computational problems of interest include computing the polyno- 
mials themselves (poly), evaluating the polynomials at specific points (eval), 
and testing membership in their ranges (range). We identify four natural fami- 
lies of prs: simple if the rules only replace variables, deterministic if no two rules 
have the same left-hand side, acyclic if no nontrivial infinite sequence of rules is 
applicable, and idempotent if the rules $(y^2, y)$ are present.

For general prs, we obtain canonical forms and we outline broad complexity 
issues. Our detailed complexity analysis involves simple prs. For instance, we 
exhibit simple and deterministic prs for which range is NP-complete. When
the prs is given as part of the input, poly is P-hard and in coRP, while range 
is NP-complete and eval is P-complete. 

Our second contribution concerns the specific case of proof trees and proof 
circuits. We prove that, to any Boolean circuit C and input x, corresponds an 
easily computable idempotent, simple, deterministic and acyclic prs S having 
the property that the number of proof trees (resp. proof circuits) of $C$ on $x$ is
the maximum (resp. minimum) value of the eval problem for $S$ on $x$, and vice
versa (see Lemma 20). This offers one viewpoint on the reason why our algorithm
from Subsect. 1.1 counts proof circuits correctly. We also prove that computing 
the minimum of the eval problem for idempotent, simple, deterministic and 
acyclic prs is #P-complete, or equivalently, that counting proof circuits is #P- 
complete under Turing reductions (but not under many-one reductions unless 
P = NP). This provides a new characterization of #P which is to be contrasted 
with Venkateswaran’s (poly-degree, poly-depth) characterization [Ven92] and 
with the retarded polynomials characterization of Babai and Fortnow [BF91]. 
We also prove that detecting whether a circuit has more proof trees than proof 
circuits is NP-complete. 

Our third contribution concerns the specific case of simple and acyclic prs. 
We prove that the eval problem for such prs is the evaluation problem for 
$\{\cup, +, \times\}$-circuits. These circuits have been considered previously (under the
name hierarchical descriptions) in [Wag84, Wag86]. They are obtained by generalizing,
from trees to general circuits, the $\{\cup, +, \times\}$-expressions (a.k.a. integer
expressions), whose evaluation problem was shown NP-complete 25 years ago by
Stockmeyer and Meyer [SM73]. From a PSPACE upper bound given in [Wag84]
we conclude that evaluation of simple acyclic prs has a polynomial space algorithm,
and from a PSPACE-hardness result given in [Yan00] we then conclude
PSPACE-completeness of our problem.

1.3 Paper Organization 

The main result of Sect. 2, in which proof trees and proof circuits are defined
formally, is that counting proof circuits is #P-complete. Section 3 introduces
polynomial replacement systems and their canonical form and defines the relevant
computational problems. Section 4 classifies prs and links them to arithmetic
circuit problems. Section 5 contains the bulk of our complexity results, and
Section 6 concludes. For lack of space, formal proofs of all claims made in this
abstract have to be omitted, but can be found in
ftp://ftp-info4.informatik.uni-wuerzburg.de/pub/ftp/TRs/mc-vo-wa00.ps.gz.

2 Counting Circuits vs. Counting Trees 

By a circuit $C$, in this paper, we will mean a circuit over the basis $\{\wedge, \vee\}$ in the
usual sense, with $2n$ inputs labeled $x_1, \ldots, x_n, \neg x_1, \ldots, \neg x_n$.

Fix an input $x$ to $C$. Unwind $C$ into a tree $C'$ by (repeatedly) duplicating
gates with fan-out greater than 1. Define a proof tree as a subgraph $H$ of $C'$
whose gates evaluate to 1 and which additionally fulfills the following properties:
$H$ must contain the output gate of $C$. For every $\wedge$ gate $v$ in $H$, all the input
wires of $v$ must be in $H$, and for every $\vee$ gate $v$ in $H$, exactly one input wire
of $v$ must be in $H$. Only wires and nodes obtained in this way belong to $H$. By
$\#C(x)$ we denote the number of proof trees of $C$ on $x$. Define a proof circuit as a
subcircuit $H$ of $C$ with the same properties as above. (I.e., the only difference
is that now we do not start by unwinding $C$ into a tree.) Given an input $x$, let
$\#_c C(x)$ denote the number of proof circuits of $C$ on $x$. We will consider the
following problems:

Problem: PT

Input: circuit $C$ over $\{\wedge, \vee\}$, an input $x \in \{0, 1\}^*$, a number $k$ in unary

Output: $\#C(x) \bmod 2^k$

Problem: PC

Input: circuit $C$ over $\{\wedge, \vee\}$, an input $x \in \{0, 1\}^*$

Output: $\#_c C(x)$

Observe that if we unwind a circuit into a tree there may be an exponential 
blowup in size, which has the consequence that the number of proof trees may 
be doubly-exponential in the size of the original circuit. This is not possible 
for the problem PC; the values of this function can be at most exponential in 
the input length. In order to achieve a fair comparison of the complexity of the 
problems, we therefore count proof trees only modulo an exponential number. 
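
The two definitions can be compared directly on small circuits by brute force. The Python sketch below counts proof trees by evaluating the circuit arithmetically, and counts proof circuits by trying every way of selecting one input wire per $\vee$ gate and collecting the distinct subcircuits reachable from the output. The circuit encoding, the function names, and the example circuit (the assumed reading of the circuit of Sect. 1.1) are illustrative. The enumeration is exponential in the number of $\vee$ gates and is only meant to make the definitions concrete. Parallel wires between the same two gates are not modeled.

```python
from itertools import product


def evaluate(circuit, x, node):
    """Boolean value of a node; leaves are looked up in x."""
    if node in x:
        return x[node]
    kind, inputs = circuit[node]
    values = [evaluate(circuit, x, i) for i in inputs]
    return int(all(values)) if kind == "and" else int(any(values))


def count_proof_trees(circuit, x, node):
    """#C(x): AND multiplies and OR adds the counts of the inputs."""
    if node in x:
        return x[node]
    kind, inputs = circuit[node]
    counts = [count_proof_trees(circuit, x, i) for i in inputs]
    if kind == "or":
        return sum(counts)
    result = 1
    for c in counts:
        result *= c
    return result


def count_proof_circuits(circuit, x, output):
    """#_c C(x): distinct edge sets obtained from one choice per OR gate."""
    or_gates = [g for g, (kind, _) in circuit.items() if kind == "or"]
    found = set()
    for choice in product(*(circuit[g][1] for g in or_gates)):
        pick = dict(zip(or_gates, choice))
        edges, visited, stack, ok = set(), set(), [output], True
        while stack:
            node = stack.pop()
            if node in visited:
                continue
            visited.add(node)
            if evaluate(circuit, x, node) != 1:  # every gate in H evaluates to 1
                ok = False
                break
            if node in x:
                continue
            kind, inputs = circuit[node]
            for child in (inputs if kind == "and" else [pick[node]]):
                edges.add((child, node))
                stack.append(child)
        if ok:
            found.add(frozenset(edges))
    return len(found)


# Assumed reading of the example circuit of Sect. 1.1.
CIRCUIT = {
    "g1": ("and", ["g2", "g3"]),
    "g2": ("or", ["x1", "g4"]),
    "g3": ("or", ["g4", "x2"]),
    "g4": ("or", ["x1", "x2"]),
}
```

With this encoding, the output of PT for a given $k$ is `count_proof_trees(CIRCUIT, x, "g1") % 2**k`.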

Theorem 1. 1. PC is complete for #P under Turing reductions, but not under
many-one reductions unless P = NP.

2. PT is complete for FP under ...

3. The following problem is NP-complete under ...: Given a circuit $C$, is there
an input $x$ such that $\#C(x) \neq \#_c C(x)$?

4. The following problem is P-complete under $\leq_m$-reductions: Given a circuit $C$ and an
input $x \in \{0, 1\}^*$, is $\#C(x) \neq \#_c C(x)$?

3 How to Generate Polynomials 

A straight line program $P$ over variables $x_1, \ldots, x_m$ is a set of instructions of
one of the following types: $x_i \leftarrow x_j + x_k$, $x_i \leftarrow x_j \cdot x_k$, $x_i \leftarrow 0$, $x_i \leftarrow 1$, where
$j, k < i$. Every variable appears at most once on the left hand side of the $\leftarrow$.
Those variables that never appear on the left hand side of the $\leftarrow$ are the input
variables. The variable $x_m$ is the output variable. Given values for the input
variables, the values of all other variables are computed in the obvious way.
The value computed by $P$ is the value of the output variable. Let $p_P$ be the
number-theoretic function computed in this way by $P$.
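
A straight line program of this kind can be evaluated by a single pass over its instructions in the order of their variables. The Python sketch below assumes an instruction format of the form `(target, op, args)`; the format, the function name, and the example program (an encoding of the assumed circuit of Sect. 1.1, with $x_3, \ldots, x_6$ standing for $g_4, g_2, g_3, g_1$) are illustrative.

```python
def eval_slp(program, inputs, output):
    """Evaluate a straight line program over the natural numbers.

    program: list of (target, op, args) with op in {"+", "*", "const"},
             listed so that every argument is defined before it is used.
    inputs:  dict giving values for the input variables.
    """
    values = dict(inputs)
    for target, op, args in program:
        if op == "const":
            values[target] = args          # args is the constant 0 or 1
        elif op == "+":
            values[target] = values[args[0]] + values[args[1]]
        elif op == "*":
            values[target] = values[args[0]] * values[args[1]]
        else:
            raise ValueError(f"unknown operation: {op}")
    return values[output]


# x3 <- x1 + x2, x4 <- x1 + x3, x5 <- x3 + x2, x6 <- x4 * x5
PROGRAM = [
    ("x3", "+", ("x1", "x2")),
    ("x4", "+", ("x1", "x3")),
    ("x5", "+", ("x3", "x2")),
    ("x6", "*", ("x4", "x5")),
]
```

On the inputs $x_1 = x_2 = 1$ this program computes the proof tree count of the example circuit.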

A straight line program hence is just another way of looking at an arithmetic 
circuit. The connection between counting proof trees and evaluating arithmetic 
circuits yields an obvious algorithm to compute the number of proof trees of a 
circuit C on input x: evaluate the straight line program obtained from C in the 
order of its variables, and plug in x. To compute the number of proof circuits 
instead, a mere variant of this algorithm was sketched in Sect. 1.1: do as for 
proof trees, but at each replacement step, express the intermediate polynomial 
as a sum of monomials and replace any occurrence of $y^2$ by $y$, for any variable
$y$.

Theorem 2. The algorithm sketched in Sect. 1.1 correctly computes the number 
of proof circuits of a circuit C on input x. 

Both the proof tree and the proof circuit counting algorithms prescribe a 
unique intermediate formal polynomial in the circuit input variables. These al- 
gorithms originate from special types of polynomial replacement systems, which 
we now define. Polynomial replacement systems will produce sets of polynomials 
from a given start polynomial, using rules replacing certain polynomials by other 
polynomials. This will be very similar to the way formal grammars produce sets 
of words from a start symbol, applying production rules. 

In this paper we almost exclusively consider polynomials with nonnegative 
integer coefficients. This is motivated by the application to proof trees and proof 
circuits discussed above. We write $p(z_1, \ldots, z_s)$ to denote that $p$ is such a
polynomial in variables $z_1, \ldots, z_s$.

Below, the variable vector $x$ will always be defined to consist of $x = (x_1, \ldots, x_m)$.
Let us say that the variable $x_i$ is fictive (or, inessential) in the polynomial
$p(x)$ if for all $a_1, \ldots, a_m, a'_i \in \mathbb{N}$ we have $p(a_1, \ldots, a_{i-1}, a_i, a_{i+1}, \ldots, a_m) =
p(a_1, \ldots, a_{i-1}, a'_i, a_{i+1}, \ldots, a_m)$. This means that $x_i$ is fictive in $p$ if and only if
$p$ can be written as a term in which $x_i$ does not appear.

Definition 3. A polynomial replacement system (for short: prs) is defined as
a quadruple $S = (\{x_1, \ldots, x_n\}, \{x_{n+1}, \ldots, x_m\}, q, R)$ where

- $\{x_1, \ldots, x_n\}$ is the set of terminal variables,

- $\{x_{n+1}, \ldots, x_m\}$ is the set of nonterminal variables,

- $q$ is a polynomial in the variables $x_1, \ldots, x_m$, the start polynomial, and

- $R$ is a finite set of replacement rules, i.e., a finite set of pairs of polynomials
in the variables $x_1, \ldots, x_m$.



How does such a system generate polynomials? 



Definition 4. Let $S = (\{x_1, \ldots, x_n\}, \{x_{n+1}, \ldots, x_m\}, q, R)$ be a prs, let $p_1, p_2$
be polynomials in the variables $x$.

$p_1 \Longrightarrow_S p_2 \iff_{\mathrm{def}}$ there exist $(p_3, p_4) \in R$ and a polynomial $p_5(x, y)$ such that
$p_1(x) = p_5(x, p_3(x))$ and $p_2(x) = p_5(x, p_4(x))$.

Let $\Longrightarrow_S^*$ be the reflexive and transitive closure of $\Longrightarrow_S$.

It turns out that the above form for derivations can be simplified:

Definition 5. Let $S, p_1, p_2$ be as above.

$p_1 \longrightarrow_S p_2 \iff_{\mathrm{def}}$ there exist $(p_3, p_4) \in R$ and polynomials $p_5(x), p_6(x)$ such that
$p_1(x) = p_5(x) \cdot p_3(x) + p_6(x)$ and $p_2(x) = p_5(x) \cdot p_4(x) + p_6(x)$.

Let $\longrightarrow_S^*$ be the reflexive and transitive closure of $\longrightarrow_S$.



Lemma 6 (Normal Form of Replacement). For any prs $S = (\{x_1, \ldots, x_n\},
\{x_{n+1}, \ldots, x_m\}, q, R)$ and any polynomials $p_1(x), p_2(x)$, we have:

$p_1 \Longrightarrow_S^* p_2$ iff $p_1 \longrightarrow_S^* p_2$.

A prs thus generates a set of polynomials; hence we define: 

Definition 7. For a prs $S = (\{x_1, \ldots, x_n\}, \{x_{n+1}, \ldots, x_m\}, q, R)$, let

poly(S) = { $p(x_1, \ldots, x_n)$ | there exists $p'(x)$ such that $q \Longrightarrow_S^* p'$ and
$p(x_1, \ldots, x_n) = p'(x_1, \ldots, x_n, a_{n+1}, \ldots, a_m)$
for all $a_{n+1}, \ldots, a_m \in \mathbb{N}$ }.

To determine the complexity of the sets poly(·), we have to fix an encoding of
polynomials. We choose to represent polynomials by straight-line programs (as, 
e.g., in [IM83, Kal88]), and state our result below for this particular representa- 
tion. Other representations have been considered in the literature (e.g., formula 
representation, different sparse representations where a polynomial is given as a 
sequence of monomials, etc.). We remark that our results remain valid for most 
of these, as we will prove in the full version of this paper. 

From the set poly(S) of polynomials we derive several sets of natural numbers,
whose complexities we will determine in the upcoming sections.

Definition 8. Let $S = (\{x_1, \ldots, x_n\}, \{x_{n+1}, \ldots, x_m\}, q, R)$ be a prs. Define

— range(S) $=_{\mathrm{def}}$ { $p(a)$ | $p \in$ poly(S) and $a \in \mathbb{N}^n$ };

— eval(S) $=_{\mathrm{def}}$ { $(a, p(a))$ | $p \in$ poly(S) and $a \in \mathbb{N}^n$ }.

Observe that if we also allow negative numbers as coefficients for our
polynomials, then there are prs $S$ such that range(S) is not decidable. This is
seen as follows. By the Robinson-Matiasjevic result (see [Mat93]), every
recursively enumerable set can be represented in the form { $p(a)$ | $a \in \mathbb{N}^n$ }
where $p$ is a suitable $n$-ary polynomial with integer coefficients. Now let $p$
be such an $n$-ary polynomial such that { $p(a)$ | $a \in \mathbb{N}^n$ } is not decidable.
Defining the prs $S_p =_{\mathrm{def}} (\{x_1, \ldots, x_n\}, \emptyset, p, \emptyset)$ we obtain poly($S_p$) = $\{p\}$ and
range($S_p$) = { $p(a)$ | $a \in \mathbb{N}^n$ }.

Besides the membership problems poly(S), range(S), and eval(S), we also
consider the corresponding variable membership problems.

Definition 9. — poly(·) $=_{\mathrm{def}}$ { $(S, p)$ | $S$ prs and $p \in$ poly(S) };

— range(·) $=_{\mathrm{def}}$ { $(S, a)$ | $S$ prs and $a \in$ range(S) };

— eval(·) $=_{\mathrm{def}}$ { $(S, a, p(a))$ | $S$ prs, $p \in$ poly(S), and $a \in \mathbb{N}^*$ }.

4 Different Types of Replacement Systems 

Prs are very general. Here, we introduce a number of natural restrictions. Our 
approach is similar to the way different restrictions of grammar types were in- 
troduced, e.g., in the definition of the classes of the Chomsky hierarchy. We 
will later view the problems of counting proof trees and proof circuits as two 
instances of a problem about these restricted prs types. 

4.1 Simple Polynomial Replacement Systems 

Definition 10. A prs $S = (\{x_1, \ldots, x_n\}, \{x_{n+1}, \ldots, x_m\}, q, R)$ is simple (or
context-free), if the polynomials in the left-hand sides of the rules of $R$ are
variables from $\{x_{n+1}, \ldots, x_m\}$.

All definitions made in the preceding section for general prs carry over to the 
special cases of simple systems. However, for simple prs we additionally define a 
particular type of replacement, where the application of a rule (z, q) results in 
the replacement of all occurrences of $z$ with $q$. This latter form is denoted by
$\mathrel{|\!\Longrightarrow}$, in contrast to the notation $\Longrightarrow$ for the derivations defined so far. Formally:

Definition 11. Let $S = (\{x_1, \ldots, x_n\}, \{x_{n+1}, \ldots, x_m\}, q, R)$ be a simple prs.

$p_1 \mathrel{|\!\Longrightarrow}_S p_2 \iff_{\mathrm{def}}$ there exists $(x_i, p_3) \in R$ such that
$p_2(x) = p_1(x_1, \ldots, x_{i-1}, p_3(x), x_{i+1}, \ldots, x_m)$.

Let $\mathrel{|\!\Longrightarrow}_S^*$ be the reflexive and transitive closure of $\mathrel{|\!\Longrightarrow}_S$.

For the sets of polynomials and numbers derived from simple systems using
our new derivation type, we use the same names as before but now use square
brackets [···] instead of parentheses (···); e.g., poly[S], poly[·], etc.

4.2 Simple Deterministic or Acyclic Polynomial Replacement Systems

Definition 12. A prs $S = (\{x_1, \ldots, x_n\}, \{x_{n+1}, \ldots, x_m\}, q, R)$ is said to be
deterministic, if no two different rules in $R$ have the same left-hand side.



Definition 13. Let $S = (\{x_1, \ldots, x_n\}, \{x_{n+1}, \ldots, x_m\}, q, R)$ be a prs. The
dependency graph $G_S$ of $S$ is the directed graph $G_S = (\{1, \ldots, m\}, E_S)$, where $E_S$
consists of all edges $(j, i)$ for which there exists a rule $(p_1, p_2) \in R$ such that $x_i$
is essential in $p_1$ and $x_j$ is essential in $p_2$. The prs $S$ is said to be acyclic, if its
dependency graph $G_S$ is acyclic.



Lemma 14. For every simple and deterministic prs $S$, there exists a simple,
deterministic, and acyclic prs $S'$, computable in polynomial time, such that
poly(S) = poly(S') and poly[S] = poly[S'].

We also obtain the following easy properties: 

Lemma 15. 1. If $S$ is a simple and deterministic prs then poly(S) = poly[S],
and this set consists of at most one polynomial.

2. If $S$ is a simple and acyclic prs then poly(S) and poly[S] are finite.

Note that there are simple and acyclic prs $S$ such that poly[S] $\subsetneq$ poly(S).
For example take $S = (\{x\}, \{z\}, 2z, \{(z, x), (z, 2x)\})$ where poly[S] = $\{2x, 4x\}$
and poly(S) = $\{2x, 3x, 4x\}$. Thus, the requirement that $S$ is deterministic is
necessary in Lemma 15.1.
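The example can be checked mechanically. The sketch below is an illustration under stated assumptions, not code from the paper: polynomials over the variables $(x, z)$ are stored as dictionaries from exponent tuples to coefficients, single-occurrence replacement follows the normal form of Definition 5 restricted to one monomial copy at a time (which generates the same closure), and all function names are made up for this example. The search terminates only because the system is acyclic, so the derivable set is finite.

```python
def add(p, q):
    """Sum of two polynomials stored as {exponent tuple: coefficient}."""
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + c
    return out


def multiply(p, q):
    """Product of two polynomials."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = tuple(a + b for a, b in zip(m1, m2))
            out[m] = out.get(m, 0) + c1 * c2
    return out


def replace_one(p, i, rhs):
    """Yield every result of replacing a single occurrence of variable i."""
    for m, c in p.items():
        if m[i] == 0:
            continue
        rest = dict(p)
        rest[m] = c - 1                      # remove one copy of the monomial
        if rest[m] == 0:
            del rest[m]
        reduced = m[:i] + (m[i] - 1,) + m[i + 1:]
        yield add(rest, multiply({reduced: 1}, rhs))


def replace_all(p, i, rhs):
    """Replace all occurrences of variable i simultaneously."""
    out = {}
    for m, c in p.items():
        term = {m[:i] + (0,) + m[i + 1:]: c}
        for _ in range(m[i]):
            term = multiply(term, rhs)
        out = add(out, term)
    return out


def derive(start, rules, all_occurrences):
    """Reflexive-transitive closure of the chosen replacement relation."""
    seen = {frozenset(start.items())}
    todo = [start]
    while todo:
        p = todo.pop()
        for i, rhs in rules:
            if all_occurrences:
                successors = [replace_all(p, i, rhs)]
            else:
                successors = list(replace_one(p, i, rhs))
            for q in successors:
                key = frozenset(q.items())
                if key not in seen:
                    seen.add(key)
                    todo.append(q)
    return seen


def terminal_only(polys, nonterminals):
    """Keep the polynomials in which no nonterminal variable occurs."""
    return {p for p in polys
            if all(m[i] == 0 for m, _ in p for i in nonterminals)}


# S = ({x}, {z}, 2z, {(z, x), (z, 2x)}) with variable order (x, z).
start = {(0, 1): 2}
rules = [(1, {(1, 0): 1}), (1, {(1, 0): 2})]

poly_paren = terminal_only(derive(start, rules, False), [1])
poly_brack = terminal_only(derive(start, rules, True), [1])
coeffs_paren = sorted(dict(p)[(1, 0)] for p in poly_paren)
coeffs_brack = sorted(dict(p)[(1, 0)] for p in poly_brack)
print(coeffs_paren, coeffs_brack)
```

The printed lists are the coefficients of $x$ in the members of poly(S) and poly[S] respectively; the polynomial $3x$ arises only when the two occurrences of $z$ in $2z = z + z$ are replaced by different rules.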

In the remainder of this subsection, we relate simple deterministic and simple 
acyclic prs to different forms of circuits operating over the natural numbers. 

First, it is intuitively clear that there is some connection between simple, de- 
terministic, and acyclic systems and straight-line programs. This is made precise 
in the following lemma. 

Lemma 16. 1. If $S$ is a simple, deterministic, and acyclic prs and poly(S) $\neq \emptyset$,
then there exists a slp $P$, computable in logarithmic space, such that
poly(S) = $\{p_P\}$.

2. If $P$ is a slp then there exists a simple, deterministic, and acyclic prs $S$,
computable in logarithmic space, such that $\{p_P\}$ = poly(S).

Next we show that acyclic systems are strongly related to a certain type of 
arithmetic circuit we now define. These circuits are immediate generalizations 
of integer expressions, introduced by Stockmeyer and Meyer [SM73]. Therefore 
we call our circuits integer circuits (not to be confused with ordinary arithmetic
circuits), or, referring to the operations allowed, $(\cup, +, \times)$-circuits.

An integer circuit with $n$ inputs is a circuit $C$ where the inner nodes compute
one of the operations $\cup, +, \times$. Such a circuit $C$ has a specified output gate $g_s$.
It computes a function $f_C : \mathbb{N}^n \to 2^{\mathbb{N}}$ as follows: We first define for every gate
$g \in C$ the function $f_g$ computed by $g$.

1. If $g$ is an input gate $x_i$, then $f_g(a_1, \ldots, a_n) = \{a_i\}$ for all $a_1, \ldots, a_n \in \mathbb{N}$.

2. If $g$ is a $+$ gate with predecessors $g_l, g_r$, then $f_g(a_1, \ldots, a_n) = \{ k + m \mid k \in
f_{g_l}(a_1, \ldots, a_n), m \in f_{g_r}(a_1, \ldots, a_n) \}$. The function computed by a $\times$ gate is
defined analogously.

3. If $g$ is a $\cup$ gate with predecessors $g_l, g_r$, then $f_g(a_1, \ldots, a_n) =
f_{g_l}(a_1, \ldots, a_n) \cup f_{g_r}(a_1, \ldots, a_n)$.

Finally, the function computed by $C$ is $f_C = f_{g_s}$.
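The inductive definition translates directly into a bottom-up evaluator. The following Python sketch is only an illustration: the dictionary encoding of circuits, the gate names, and the function `evaluate` are assumptions made here, not notation from the paper. It computes the full set at every gate, so the sets it builds can become very large on deep circuits.

```python
def evaluate(circuit, output, inputs):
    """Return the set of naturals computed at gate `output`.

    circuit maps a gate name to ('in', i) for the i-th input, or to
    (op, left, right) with op one of 'U', '+', '*'.
    """
    memo = {}

    def value(g):
        if g in memo:
            return memo[g]
        node = circuit[g]
        if node[0] == 'in':
            result = frozenset({inputs[node[1]]})
        else:
            op, left_gate, right_gate = node
            left, right = value(left_gate), value(right_gate)
            if op == 'U':
                result = left | right
            elif op == '+':
                result = frozenset(k + m for k in left for m in right)
            elif op == '*':
                result = frozenset(k * m for k in left for m in right)
            else:
                raise ValueError(f"unknown gate type: {op}")
        memo[g] = result
        return result

    return value(output)


# Hypothetical circuit: g3 = ((x0 U x1) + (x0 U x1)) * x0
circuit = {
    'x0': ('in', 0),
    'x1': ('in', 1),
    'g1': ('U', 'x0', 'x1'),
    'g2': ('+', 'g1', 'g1'),
    'g3': ('*', 'g2', 'x0'),
}
result = evaluate(circuit, 'g3', (2, 3))
print(sorted(result))
```

On input $(2, 3)$ the gate `g1` yields $\{2, 3\}$, `g2` yields $\{4, 5, 6\}$, and the output gate multiplies each of these by 2. Deciding membership $b \in f_C(a)$ with this evaluator amounts to testing `b in evaluate(...)`.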

The following relation between simple, acyclic replacement systems and in- 
teger circuits is obtained by an easy induction: 

Lemma 17. 1. For every simple, acyclic prs $S$, there is an integer circuit $C$
with $n$ inputs, computable in logarithmic space, such that $f_C(a) = \{ b \mid
(a, b) \in$ eval(S) $\}$ for all $a \in \mathbb{N}^n$.

2. For every integer circuit $C$ with $n$ inputs, there is a simple, acyclic prs $S$,
computable in logarithmic space, such that $\{ b \mid (a, b) \in$ eval(S) $\} = f_C(a)$
for all $a \in \mathbb{N}^n$.

We consider the following problems: 

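N-MEMBER($\cup, +, \times$) $=_{\mathrm{def}}$ { $(C, a, b)$ | $C$ is an integer circuit with $n$ inputs,
$a \in \mathbb{N}^n$, $b \in \mathbb{N}$ and $b \in f_C(a)$ }

N-RANGE($\cup, +, \times$) $=_{\mathrm{def}}$ { $(C, b)$ | $C$ is an integer circuit with $n$ inputs,
$b \in \mathbb{N}$ and $(\exists a \in \mathbb{N}^n)\, b \in f_C(a)$ }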

Analogous notations will be used when we restrict the gate types allowed. 

The following lemma is immediate from Lemma 17: 

Lemma 18. 1. N-MEMBER($\cup, +, \times$) $\equiv^{\log}_m$ eval(·).

2. N-RANGE($\cup, +, \times$) $\equiv^{\log}_m$ range(·).

4.3 Idempotent Polynomial Replacement Systems 

Definition 19. For a prs $S = (\{x_1, \ldots, x_n\}, \{x_{n+1}, \ldots, x_m\}, q, R)$, we define
$S_{\mathrm{idem}} = (\{x_1, \ldots, x_n\}, \{x_{n+1}, \ldots, x_m\}, q, R \cup \{ (x_i^2, x_i) \mid 1 \leq i \leq m \})$ to be the
idempotent prs derived from $S$.

In the case that $S$ is simple (deterministic, acyclic, resp.), we will say that
$S_{\mathrm{idem}}$ is an idempotent simple (deterministic, acyclic, resp.) prs.

For a prs $S = (\{x_1, \ldots, x_n\}, \{x_{n+1}, \ldots, x_m\}, q, R)$ and $a \in \mathbb{N}^n$, we write
min eval(S, a) as a shorthand for $\min\{ p(a) \mid p \in$ poly(S) $\}$ (analogously, we
use max eval(S, a)).

Lemma 20. 1. For every Boolean circuit $C$, input $x$, and $k \in \mathbb{N}$, there
exists a simple, deterministic and acyclic polynomial replacement system $S$,
computable in logarithmic space, such that min eval($S_{\mathrm{idem}}$, $(1, \ldots, 1)$) =
$\#_c C(x)$, and max eval($S_{\mathrm{idem}}$, $(1, \ldots, 1)$) = $\#C(x)$.

2. For every simple, deterministic, and acyclic prs $S_{\mathrm{idem}}$ there exists a Boolean
circuit $C$, computable in logarithmic space, such that min eval($S_{\mathrm{idem}}$, $(1, \ldots,
1)$) = $\#_c C(x)$, and max eval($S_{\mathrm{idem}}$, $(1, \ldots, 1)$) = $\#C(x)$.

5 Complexity Results for Simple Replacement Systems

5.1 Deterministic Systems 

In this section, we consider the complexity of the above defined sets for sim- 
ple replacement systems. Let us start with the complexity of fixed membership 
problems. 

Theorem 21. Let $S$ be simple and deterministic.

1. poly(S'), poly[S'] are V -complete. 

2. range(S), range[S] $\in$ NP, and there are systems $S$ such that the problems
range(S) and range[S] are NP-complete.

3. eval(S), eval[S] $\in$ TC$^0$.

Concerning variable membership problems of simple, deterministic systems, 
we obtain: 

Theorem 22. For simple and deterministic prs, 

1. poly(-) and poly[-] are in coRP and P-hard under 

2. range(-) and range[-] are ^P-complete under 

3. eval(·) and eval[·] are P-complete under $\leq^{\log}_m$.

5.2 Acyclic Systems 

For simple and acyclic systems S which are not necessarily deterministic, the 
sets poly(S) and poly[S] are finite (by Lemma 15), hence the statement of
Theorem 21 also holds in this case. Again, interesting questions arise when we 
examine variable membership problems. 

Theorem 23. For simple and acyclic prs, 

1. poly[-] is contained in MA and is NP-hard under 

2. range)-] and eval[-] are ^P-complete under 

Next, we turn to different variable membership problems for simple, acyclic 
systems under derivations. 

Stockmeyer and Meyer considered integer expressions (in our terminology, 
these are integer circuits with fan-out of non-input gates at most 1) where the 
only operations allowed are $\cup$ and $+$. They proved that the membership problem
in that case is NP-complete. It is easy to see that their result carries over to the
case that we also allow multiplication, i.e., the problems N-MEMBER($\cup, +$) and
N-MEMBER($\cup, +, \times$) for expressions are NP-complete.

The corresponding problems for circuits were not considered in their paper, 
but in later papers by Wagner [Wag84, Wag86] (under the name hierarchical 
descriptions). Only PSPACE as upper bound for membership is known from 
there, but recently it was shown by Ke Yang that both circuit problems are 
PSPACE-hard [Yan00].

Since (by Lemma 18) the member and range problems for these circuits
are equivalent to the eval(·) and range(·) problems for simple acyclic prs, we
conclude:

Theorem 24. For simple and acyclic prs and for all representations,

1. poly(·) $\in$ EXPTIME,

2. range(·), eval(·) are PSPACE-complete.

5.3 Idempotent Systems 

Again, since also here poly(S) and poly[S] are finite, we obtain results
analogous to Theorem 21. For the variable membership problems the following can
be said:

Theorem 25. For idempotent, simple, deterministic, and acyclic systems, we
obtain poly(·), range(·), eval(·) $\in$ EXPTIME.

Lemma 20 shows the importance of the minimization and maximization op- 
erations in the case of idempotent systems. We obtain from Theorem 1: 

Theorem 26. For idempotent, simple, deterministic, and acyclic replacement 
systems, 

1. the functions minEVAL(-) and minEVAL[-] are ffP-complete under 
reductions, 

2. the functions maxEVAL(-) and maxEVAL[-] are FP-complete under <m- 



Remark 27. For simple, deterministic and for simple, acyclic prs, the functions
min eval(·), min eval[·], max eval(·), max eval[·] are FP-complete.

6 Conclusion 

Our original motivation was the PC problem for circuits of restricted depth. 
Our proof (omitted in this proceedings version) shows that the problem is
#P-complete even for circuits of depth 4. For depth-2 circuits, the problem is easily
seen to be in FP. The case of depth 3 remains open. 

Determining the complexity of the sets range(S), range[S] for fixed $S$ is equivalent
to determining the complexity of the range of a multivariate polynomial with
nonnegative integer coefficients. While this is always an NP-problem, the proof
of our result (again, unfortunately, omitted here) shows that there is a 4-variable
polynomial of degree 6 whose range is NP-complete. Can this be improved?

A lot of interesting questions about prs remain open. To come back to some 
of the problems posed in Subsect. 1.1, we did not look at all at multivariate 
rules. Also, it seems worthwhile to examine if, besides idempotent systems, other 
prs families can be related to various types of arithmetic circuits and counting 
problems in Boolean circuits. 

Acknowledgment. We are grateful to Sven Kosub (Würzburg) and Thomas
Thierauf (Ulm) for helpful discussions.







References 



[AAB+99] E. Allender, A. Ambainis, D. Mix Barrington, S. Datta, and H. LeThanh.
Bounded depth arithmetic circuits: Counting and closure. In Proceedings
26th International Colloquium on Automata, Languages and Programming,
Lecture Notes in Computer Science, Berlin Heidelberg, 1999.
Springer Verlag. To appear.

[AAD97] M. Agrawal, E. Allender, and S. Datta. On TC$^0$, AC$^0$, and arithmetic
circuits. In Proceedings 12th Computational Complexity, pages 134-148.
IEEE Computer Society, 1997.

[All98] E. Allender. Making computation count: arithmetic circuits in the
nineties. SIGACT News, 28(4):2-15, 1998.

[BF91] L. Babai and L. Fortnow. Arithmetization: a new method in structural
complexity theory. Computational Complexity, 1:41-66, 1991.

[CMTV98] H. Caussinus, P. McKenzie, D. Therien, and H. Vollmer. Nondeterministic
NC$^1$ computation. Journal of Computer and System Sciences, 57:200-212,
1998.

[IM83] O. Ibarra and S. Moran. Probabilistic algorithms for deciding equivalence
of straight-line programs. Journal of the ACM, 30:217-228, 1983.

[Kal88] E. Kaltofen. Greatest common divisors of polynomials given by
straight-line programs. Journal of the ACM, 35:231-264, 1988.

[Mat93] Y. V. Matiasjevic. Hilbert's Tenth Problem. Foundations of Computing
Series. MIT Press, Cambridge, MA, 1993.

[SM73] L. J. Stockmeyer and A. R. Meyer. Word problems requiring exponential
time. In Proceedings 5th ACM Symposium on the Theory of Computing,
pages 1-9. ACM Press, 1973.

[Sud78] I. H. Sudborough. On the tape complexity of deterministic context-free
languages. Journal of the Association for Computing Machinery, 25:405-414,
1978.

[Val79] L. G. Valiant. The complexity of enumeration and reliability problems.
SIAM Journal of Computing, 8(3):411-421, 1979.

[Ven92] H. Venkateswaran. Circuit definitions of non-deterministic complexity
classes. SIAM Journal on Computing, 21:655-670, 1992.

[Vin91] V. Vinay. Counting auxiliary pushdown automata and semi-unbounded
arithmetic circuits. In Proceedings 6th Structure in Complexity Theory,
pages 270-284. IEEE Computer Society Press, 1991.

[VSBR83] L. Valiant, S. Skyum, S. Berkowitz, and C. Rackoff. Fast parallel
computation of polynomials using few processors. SIAM Journal on Computing,
12:641-644, 1983.

[VT89] H. Venkateswaran and M. Tompa. A new pebble game that characterizes
parallel complexity classes. SIAM J. on Computing, 18:533-549, 1989.

[Wag84] K. W. Wagner. The complexity of problems concerning graphs with
regularities. Technical report, Friedrich-Schiller-Universität Jena, 1984.
Extended abstract in Proceedings 11th Mathematical Foundations of
Computer Science, Lecture Notes in Computer Science 176, pages 544-552,
1984.

[Wag86] K. W. Wagner. The complexity of combinatorial problems with succinct
input representation. Acta Informatica, 23:325-356, 1986.

[Yan00] K. Yang. Integer circuit evaluation is PSPACE-complete. In Proceedings
15th Computational Complexity Conference, pages 204-211. IEEE
Computer Society Press, 2000.




Depth-3 Arithmetic Circuits for $S_n^2(X)$ and
Extensions of the Graham-Pollack Theorem



Jaikumar Radhakrishnan^1, Pranab Sen^1, and Sundar Vishwanathan^2

^1 School of Technology and Computer Science, Tata Institute of Fundamental
Research, Mumbai 400005, India
{jaikumar,pranab}@tcs.tifr.res.in

^2 Department of Computer Science and Engineering, Indian Institute of Technology,
Mumbai 400076, India
sundar@cse.iitb.ernet.in



Abstract. We consider the problem of computing the second elementary
symmetric polynomial $S_n^2(X) = \sum_{1 \leq i < j \leq n} X_i X_j$ using depth-three
arithmetic circuits of the form $\sum_{i=1}^{r} \prod_{j=1}^{s_i} L_{ij}(X)$, where each $L_{ij}$ is a
linear form. We consider this problem over several fields and determine
exactly the number of multiplication gates required. The lower bounds
are proved for inhomogeneous circuits where the $L_{ij}$'s are allowed to have
constants; the upper bounds are proved in the homogeneous model. For
reals and rationals the number of multiplication gates required is exactly
$n - 1$; in most other cases, it is $\lceil n/2 \rceil$.

This problem is related to the Graham-Pollack theorem in algebraic
graph theory. In particular, our results answer the following question
of Babai and Frankl: what is the minimum number of complete bipartite
graphs required to cover each edge of a complete graph an odd number
of times? We show that for infinitely many $n$, the answer is $\lceil n/2 \rceil$.



1 Introduction 

1.1 The Graham-Pollack Theorem

Let $K_n$ denote the complete graph on $n$ vertices. By a decomposition of $K_n$ we
mean a set $\{G_1, G_2, \ldots, G_r\}$ of subgraphs of $K_n$ such that

1. Each $G_i$ is a complete bipartite graph (on some subset of the vertex set of
$K_n$); and

2. Each edge of $K_n$ appears in precisely one of the $G_i$'s.

It is easy to see that there is such a decomposition of the complete graph with
$n - 1$ bipartite graphs. Graham and Pollack [4] showed that this is tight.

Theorem. If $\{G_1, G_2, \ldots, G_r\}$ is a decomposition of $K_n$, then $r \geq n - 1$.
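One standard way to meet the bound with $n - 1$ graphs is to use stars: the $i$-th graph joins vertex $i$ to all larger vertices. The short Python sketch below checks this for a small case; the construction is a well-known one chosen here for illustration (the text does not say which decomposition it has in mind), and the function names are assumptions.

```python
from itertools import combinations


def star_decomposition(n):
    """n - 1 complete bipartite graphs (A, B): vertex i versus all larger ones."""
    return [({i}, set(range(i + 1, n + 1))) for i in range(1, n)]


def cover_counts(graphs, n):
    """For each edge {u, v} of K_n, count the bipartite graphs containing it."""
    return {
        (u, v): sum(
            1 for A, B in graphs
            if (u in A and v in B) or (u in B and v in A)
        )
        for u, v in combinations(range(1, n + 1), 2)
    }


graphs = star_decomposition(6)
counts = cover_counts(graphs, 6)
print(len(graphs), set(counts.values()))
```

Each edge $\{u, v\}$ with $u < v$ lies only in the star centred at $u$, so every count equals one and the five stars decompose $K_6$.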

The original proof of this theorem, and other proofs discovered since then [3, 10,
14], used algebraic reasoning in one form or another; no combinatorial proof of
this fact is known.

One of the goals of this paper is to obtain extensions of this theorem. To
better motivate the problems we study, we first present a proof of this theorem.
This will also help us explain how algebraic reasoning enters the picture. Consider
polynomials in variables $X = X_1, X_2, \ldots, X_n$ with rational coefficients. Let

$S_n^2(X) = \sum_{1 \leq i < j \leq n} X_i X_j$;

$T_n^2(X) = \sum_{i=1}^{n} X_i^2$.

Then, we can reformulate the question as follows. What is the smallest $r$ for
which there exist sets $L_i, R_i \subseteq [n]$ ($L_i \cap R_i = \emptyset$) for $i = 1, 2, \ldots, r$, such that

$S_n^2(X) = \sum_{i=1}^{r} \bigl(\sum_{j \in L_i} X_j\bigr) \bigl(\sum_{j \in R_i} X_j\bigr)$?    (1)

Notice that the two sums in the product on the right correspond to homogeneous
linear forms. One may generalise this question, and ask: What is the smallest $r$
for which there exist homogeneous linear forms $L_i(X), R_i(X)$ for $i = 1, 2, \ldots, r$,
such that

$S_n^2(X) = \sum_{i=1}^{r} L_i(X) R_i(X)$?    (2)

Graham and Pollack gave the following elegant argument to show that $r$ must be
at least $n - 1$. Observe that $S_n^2(X) = \frac{1}{2}\bigl[(\sum_{i=1}^{n} X_i)^2 - T_n^2(X)\bigr]$. Thus, (2) implies

$T_n^2(X) = \bigl(\sum_{i=1}^{n} X_i\bigr)^2 - 2 \sum_{i=1}^{r} L_i(X) R_i(X)$.    (3)

Now, if $r$ is less than $n - 1$, then there exists a non-zero $a = (a_1, a_2, \ldots, a_n) \in \mathbb{Q}^n$
such that $L_i(a) = 0$ for $i = 1, 2, \ldots, r$, and $\sum_{i=1}^{n} a_i = 0$ (because at most $n - 1$
homogeneous equations in $n$ variables always have a non-zero solution). Under
this assignment to the variables, the right hand side of (3) is zero but the left
hand side is not.
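For reference, the identity relating the two polynomials and the final contradiction can be written out as follows. This is a worked restatement of the argument above in LaTeX, using only the symbols already introduced.

```latex
\begin{align*}
\Bigl(\sum_{i=1}^{n} X_i\Bigr)^{2}
  &= \sum_{i=1}^{n} X_i^{2} + 2 \sum_{1 \le i < j \le n} X_i X_j
   = T_n^2(X) + 2\, S_n^2(X), \\
S_n^2(X)
  &= \tfrac{1}{2}\Bigl[\Bigl(\sum_{i=1}^{n} X_i\Bigr)^{2} - T_n^2(X)\Bigr]. \\
\intertext{At a non-zero $a \in \mathbb{Q}^n$ with $L_i(a) = 0$ for all $i$ and
$\sum_{i} a_i = 0$, equation (3) would give}
0 < \sum_{i=1}^{n} a_i^{2} = T_n^2(a)
  &= \Bigl(\sum_{i=1}^{n} a_i\Bigr)^{2} - 2 \sum_{i=1}^{r} L_i(a) R_i(a) = 0 .
\end{align*}
```

The strict inequality on the left uses that a sum of squares of rationals, not all zero, is positive; this is the step that fails over other fields.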

With this introduction to the Graham-Pollack theorem and its proof, we are
now ready to state the questions we consider in this paper. Observe that the
lower bound above depended crucially on the field being $\mathbb{Q}$, and there are two
main difficulties in generalising it to other fields. First, over fields of characteristic
two, the relationship between $S_n^2(X)$ and $T_n^2(X)$ does not hold, for we cannot
divide by 2. Second, even if we are not working over fields of characteristic
two, $T_n^2(X)$ can vanish at some non-zero points. Equations similar to the ones
considered above have been studied in the past in at least two different contexts,
viz. covering a complete graph by complete bipartite graphs such that each
edge is covered an odd number of times (the odd cover problem) and depth-3
arithmetic circuits for $S_n^2(X)$.

1.2 The Odd Cover Problem

Suppose in the problem on graphs above, we drop the condition that the bipartite
graphs be edge-disjoint, but instead ask for each edge of the complete graph to be
covered an odd number of times. How many bipartite graphs are required in such
a cover? This question was stated by Babai and Frankl [2], who also observed a
lower bound of $\lfloor n/2 \rfloor$. Note that this problem is equivalent to considering (1) over
the field GF(2).

1.3 $\Sigma\Pi\Sigma$ Arithmetic Circuits

By a $\Sigma\Pi\Sigma$ arithmetic circuit we mean an expression of the form

$\sum_{i=1}^{r} \prod_{j=1}^{s_i} L_{ij}(X)$,    (4)

where each $L_{ij}$ is a (possibly inhomogeneous) linear form in variables $X_1, \ldots, X_n$.
Such 'depth-three' circuits play an important role in the study of arithmetic
complexity [8, 9, 6, 13]. If each linear form $L_{ij}(X)$ is homogeneous (i.e. has
constant term zero) then the circuit is said to be homogeneous, or else, it is
said to be inhomogeneous. Although depth-three circuits appear to be rather
restrictive, these are the strongest model of circuits for which superpolynomial
lower bounds are known; no such lower bounds are known at present for
depth-four circuits.

The $k$-th elementary symmetric polynomial on $n$ variables is defined by

$S_n^k(X) = \sum_{T \in \binom{[n]}{k}} \prod_{i \in T} X_i$.

Elementary symmetric polynomials are the most commonly studied candidates
for showing lower bounds in arithmetic circuits. Nisan and Wigderson [9] showed
that any homogeneous $\Sigma\Pi\Sigma$ circuit for computing $S_n^{2k}(X)$ has size $\Omega((n/4k)^k)$.
In their paper, they explicitly stated the method of partial derivatives (but see
also Alon [1]). Although a superpolynomial lower bound was obtained in [9],
the lower bound applied only to homogeneous circuits. Indeed, Ben-Or (see [9])
showed that any elementary symmetric polynomial can be computed by an
inhomogeneous $\Sigma\Pi\Sigma$ formula of size $O(n^2)$. Thus inhomogeneous circuits are
significantly more powerful than homogeneous circuits. Shpilka and Wigderson [13]
(and later, Shpilka [12]) addressed this shortcoming of the Nisan-Wigderson
result and showed an $\Omega(n^2)$ lower bound on the size of inhomogeneous formulae
computing certain elementary symmetric polynomials, thus showing that
Ben-Or's construction is optimal. To obtain their results they augmented the method
of partial derivatives by an analysis of (affine) subspaces where elementary
symmetric polynomials vanish. Many of the lower bounds in this paper are inspired
by the insights from these papers. All the results cited above work over fields
of characteristic zero. At present no super-quadratic lower bounds are known
for computing some explicitly defined polynomial in the inhomogeneous model
over infinite fields. Over finite fields the situation is better. Karpinski and
Grigoriev [5] showed an exponential lower bound for computing the determinant
polynomial using (inhomogeneous) $\Sigma\Pi\Sigma$ circuits over any finite field. Grigoriev
and Razborov [6] showed an exponential lower bound for any (inhomogeneous)
$\Sigma\Pi\Sigma$ circuit computing a generalised majority function over any finite field.
Thus, the elementary symmetric polynomials have been studied with
considerable success in the past in this arithmetic model of computation.

Organisation of This Paper 

In the next section, we give a summary of our results. In Sect. 3, we present 
formal proofs of our upper bound results for GF(2). Section 4 contains formal 
proofs for our lower bound results for GF(2). A summary of the proof methods 
for our upper and lower bound results, as well as the proofs of the lemmas and 
theorems which have been omitted here, can be found in the full version [11] at
http://www.tcs.tifr.res.in/~pranab/papers/s2n.ps.



2 Our Results 

We study the computation of the symmetric polynomial $S_n^2(X)$ using $\Sigma\Pi\Sigma$
arithmetic circuits over several fields, with the aim of obtaining tight bounds
on the number of multiplication gates required. Many of the techniques
developed earlier (in particular, the method of partial derivatives), in fact, give lower
bounds on the number of multiplication gates. Unlike the previous results in
arithmetic circuits, we will not be satisfied with obtaining bounds up to
constant factors; instead, we shall try to get the exact answer, in the spirit of the
Graham-Pollack theorem.

As described above, computations of elementary symmetric polynomials have
been considered for several kinds of $\Sigma\Pi\Sigma$ circuits. For the polynomial $S_n^2(X)$,
we have three different models.

1. The graph model: This is the most restrictive model. Here the linear forms 
Li and Ri must correspond to bipartite graphs; that is, all coefficients must 
be 1 (or 0), no variable can appear in both Li and Ri (with coefficient 1), 
and no constant term is allowed in these linear forms. This is the setting for 
the Graham-Pollack theorem and its generalisations. 

2. The homogeneous model: Here the linear forms are required to be homoge- 
neous, that is, no constant term is allowed in them. However, any element 
from the field is allowed as a coefficient in the linear forms. This model was 
studied by Nisan and Wigderson [9], using the method of partial derivatives. 

3. The inhomogeneous model: This is the most general model; there is no re- 
striction on the coefficients or the constant term. 

We show our upper bounds in the graph and the homogeneous model; our 
lower bounds hold even in the stronger inhomogeneous model. We juxtapose our
results against the previously known results, highlighting our contribution. Note
that the previous lower bounds were only for the homogeneous circuit model
which were proved using the method of partial derivatives [9] (see also the rank
arguments of Babai and Frankl [2]). The notation $\exists^\infty n$ used below stands for
'for infinitely many $n$'.

2.1 Computing $S_n^2(X)$ over Finite Fields of Odd Characteristic

2.2 The Odd Cover Problem and Computing $S_n^2(X)$ over GF(2)

2.3 Computing $S_n^2(X)$ over $\mathbb{C}$

2.5 Computing $S_n^2(X)$ over $\mathbb{R}$ and $\mathbb{Q}$

3 Upper Bounds

3.1 The Odd Cover Problem and Computing $S_n^2(X)$ over GF(2)

In this section, we will show that there is an odd cover of $K_{2n}$ by $n$ bipartite
graphs whenever there exists an $n \times n$ matrix satisfying certain properties. For
this, we describe a scheme for producing an odd cover of $K_{2n}$.

We want to cover the edges of $K_{2n}$ with $n$ bipartite graphs such that each edge
is covered an odd number of times. Each complete bipartite graph is specified
by specifying the two parts $A$ and $B$. Partition the vertex set $[2n]$ (of $K_{2n}$)
into ordered pairs $(1, 2), (3, 4), \ldots, (2n-1, 2n)$. In our construction, if the vertex
$2i - 1$ of the pair $(2i-1, 2i)$ appears in part $A$ of a bipartite graph, then $2i$
appears in part $B$; similarly, if $2i$ appears in part $A$, then $2i - 1$ appears in part
$B$. In particular, if one element of the pair does not participate in the bipartite
graph, then the other element does not participate in it either. We shall call such
a construction a pairs construction.

Hence, to describe a bipartite graph, it suffices to specify for each pair
$(2i-1, 2i)$ whether the pair participates in the bipartite graph, and when it does,
whether $2i$ appears in part $A$ or part $B$. The $n$ bipartite graphs are specified
using an $n \times n$ matrix $M$ with entries in $\{-1, 0, 1\}$. The rows of the matrix are
indexed by pairs; the $i$th row is for the pair $(2i-1, 2i)$. The columns are indexed
by bipartite graphs. If $M_{ij} = 0$, then the pair $(2i-1, 2i)$ does not participate in
the $j$th bipartite graph; if $M_{ij} = 1$, then $2i$ appears in part $B$; if $M_{ij} = -1$, then
$2i$ appears in part $A$. We now identify properties of the matrix $M$ that ensure
that the bipartite graphs arising from it form an odd cover of $K_{2n}$.

Definition 1. A matrix with entries from $\{-1, 0, 1\}$ is good if it satisfies the
following conditions:

1. In every row, the number of non-zero entries is odd.

2. For every pair of distinct rows, the number of columns where they both have
non-zero entries is congruent to 2 mod 4.

3. Any two distinct rows are orthogonal over the integers.

Lemma 1. If an $n \times n$ matrix is good, then the $n$ complete bipartite graphs that
arise from it form an odd cover of $K_{2n}$.

Proof. The proof appears in the full version of the paper [11]. □ 
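Lemma 1 can be checked on a small instance. The Python sketch below is an illustration, not part of the paper: it tests the three conditions of Definition 1, builds the bipartite graphs of the pairs construction, and counts how often each edge of $K_{2n}$ is covered. The $4 \times 4$ skew-symmetric conference matrix `C` and all function names are choices made for this example.

```python
from itertools import combinations


def is_good(M):
    """Check the three conditions of Definition 1 for a {-1, 0, 1} matrix."""
    for row in M:
        if sum(1 for v in row if v != 0) % 2 != 1:
            return False
    for a, b in combinations(M, 2):
        common = sum(1 for u, v in zip(a, b) if u != 0 and v != 0)
        dot = sum(u * v for u, v in zip(a, b))
        if common % 4 != 2 or dot != 0:
            return False
    return True


def bipartite_graphs(M):
    """Column j gives parts (A, B); row i (0-based) is the pair (2i+1, 2i+2)."""
    graphs = []
    for j in range(len(M[0])):
        A, B = set(), set()
        for i, row in enumerate(M):
            odd_vertex, even_vertex = 2 * i + 1, 2 * i + 2
            if row[j] == 1:          # even vertex of the pair goes to part B
                A.add(odd_vertex)
                B.add(even_vertex)
            elif row[j] == -1:       # even vertex of the pair goes to part A
                A.add(even_vertex)
                B.add(odd_vertex)
        graphs.append((A, B))
    return graphs


def cover_counts(graphs, num_vertices):
    """Number of bipartite graphs containing each edge of the complete graph."""
    return {
        (u, v): sum(
            1 for A, B in graphs
            if (u in A and v in B) or (u in B and v in A)
        )
        for u, v in combinations(range(1, num_vertices + 1), 2)
    }


# A skew-symmetric conference matrix of order 4 (example chosen here).
C = [
    [0, 1, 1, 1],
    [-1, 0, 1, -1],
    [-1, -1, 0, 1],
    [-1, 1, -1, 0],
]
counts = cover_counts(bipartite_graphs(C), 8)
print(is_good(C), sorted(set(counts.values())))
```

For this matrix the edge inside each pair is covered three times and every other edge exactly once, so four bipartite graphs give an odd cover of $K_8$.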

Thus, to obtain odd covers, it is enough to construct good matrices. We now 
give two methods for constructing such matrices. 

Construction 1: Skewsymmetric Conference Matrices 

A Hadamard matrix Hn is an n x n matrix with entries in {—1, 1} such that 
Hnllf, = nl. A conference matrix C„ is an n x n matrix, with O’s on the diagonal 
and — 1,+1 elsewhere, such that C„Cj = {n — 1)1. The following fact can be 
verified easily. 

Lemma 2. n x n conference matrices, where n = 0 mod 4 are good matrices. 

Skewsymmetric conference matrices can be obtained from skew Hadamard ma- 
trices. A skew Hadamard matrix is defined as a Hadamard matrix that one gets 
by adding the identity matrix to a skewsymmetric conference matrix. Several 
constructions of skew Hadamard matrices can be found in [7, p. 247]. In partic- 
ular, the following theorem is proved there. 

Theorem 1. There is a skew Hadamard matrix of order $n$ if $n = 2^t k_1 \cdots k_s$,
where $n \equiv 0 \pmod 4$, each $k_i \equiv 0 \pmod 4$ and each $k_i$ is of the form $p^r + 1$, $p$ an
odd prime.

Corollary 1. There is a good matrix of order $n$ if $n$ satisfies the conditions in
the above theorem. Note that the conditions hold for infinitely many $n$.

An easy construction of a skew Hadamard matrix for $n = 2^t$, $t > 1$, is given
in the full version of the paper [11].
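For a concrete instance of a skew-symmetric conference matrix, one classical option is the Paley construction for a prime $q \equiv 3 \pmod 4$. This is not the construction of the full version [11], only a standard example that is easy to test; the function names are hypothetical, and primality of `q` is assumed rather than checked.

```python
def legendre(a, q):
    """Quadratic character of a modulo an odd prime q (Euler's criterion): 0, 1 or -1."""
    a %= q
    if a == 0:
        return 0
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1

def paley_skew_conference(q):
    """Skew-symmetric conference matrix of order q + 1, for a prime q = 3 (mod 4)."""
    if q % 4 != 3:
        raise ValueError("q must be congruent to 3 mod 4")
    n = q + 1
    C = [[0] * n for _ in range(n)]
    for k in range(1, n):
        C[0][k] = 1          # first row: 0 followed by +1's
        C[k][0] = -1         # first column: 0 followed by -1's
    for i in range(q):
        for j in range(q):
            C[i + 1][j + 1] = legendre(i - j, q)
    return C

def gram(M):
    """Return M * M^T as a list of lists."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in M] for r1 in M]

def scalar_matrix(n, c):
    """Return c times the n x n identity matrix."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]
```

Adding the identity to the matrix returned by `paley_skew_conference(q)` gives a skew Hadamard matrix of order $q + 1$, in line with the definition above.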

Construction 2: Symmetric Designs 

The matrices we now construct are based on a well-known construction for 
symmetric designs. These matrices are not conference matrices; in fact, they 
have more than one zero in every row. 

Let $q$ be a prime power congruent to 3 mod 4. Let $\mathbb{F} = GF(q)$ be the finite
field of $q$ elements. Index the rows with the lines and the columns with points
of the projective 2-space over $\mathbb{F}$. That is, the projective points and lines are the
one dimensional and two dimensional subspaces respectively, of $\mathbb{F}^3$. A projective
point is represented by a vector in $\mathbb{F}^3$ (out of $q - 1$ possible representatives)
in the one dimensional subspace corresponding to it. A projective line is also
represented by a vector in $\mathbb{F}^3$ (out of $q - 1$ possible representatives). The
representative for a projective line can be thought of as a 'normal vector' to the two
dimensional subspace corresponding to it. We associate with each projective line
$L$ a linear form on the vector space $\mathbb{F}^3$, given by

$$L(w) = v^T w,$$

where $w \in \mathbb{F}^3$ and $v$ is the representative for $L$. For a projective line $L$ and a
projective point $Q$, let $L(Q) = L(w)$, where $w$ is the representative for $Q$.

Now, the matrix $M$ is defined as follows. If $L(Q) = 0$, then we set $M_{L,Q} = 0$;
if $L(Q)$ is a (non-zero) square in $\mathbb{F}$, set $M_{L,Q} = 1$; otherwise, set $M_{L,Q} = -1$.

The proof that $M$ is a good matrix appears in the full version of the
paper [11]. We thus have proved the following lemma.

Lemma 3. If $q \equiv 3 \pmod 4$ is a prime power then there is a good matrix of order
$q^2 + q + 1$. Note that infinitely many such $q$ exist.
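Construction 2 can be tried out for small prime $q$. The sketch below restricts to prime $q$ so that $GF(q)$ is just the integers modulo $q$ (a simplifying assumption; prime powers would need a proper finite-field implementation). It picks, as representative of each point and each line, the vector whose first non-zero coordinate is 1. All names are hypothetical.

```python
from itertools import combinations, product

def legendre(a, q):
    """Quadratic character of a modulo an odd prime q: 0, 1 or -1."""
    a %= q
    if a == 0:
        return 0
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1

def projective_reps(q):
    """One representative per 1-dimensional subspace of GF(q)^3 (q prime)."""
    reps = []
    for v in product(range(q), repeat=3):
        nonzero = [x for x in v if x != 0]
        if nonzero and nonzero[0] == 1:   # normalise: first non-zero coordinate is 1
            reps.append(v)
    return reps

def design_matrix(q):
    """Rows are lines (normal vectors), columns are points; entry is the character of v.w."""
    reps = projective_reps(q)
    return [[legendre(sum(a * b for a, b in zip(v, w)), q) for w in reps]
            for v in reps]

def is_good(M):
    """Check the three conditions of Definition 1."""
    if any(sum(1 for x in row if x != 0) % 2 != 1 for row in M):
        return False
    for r1, r2 in combinations(M, 2):
        common = sum(1 for a, b in zip(r1, r2) if a != 0 and b != 0)
        if common % 4 != 2 or sum(a * b for a, b in zip(r1, r2)) != 0:
            return False
    return True
```

For $q = 3$ this yields a $13 \times 13$ matrix with four zeros in every row, one for each point on the corresponding line.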

We can now easily prove the following theorem and its corollary. 

Theorem 2. For infinitely many $n \equiv 0, 2 \pmod 4$ we have an odd cover of $K_n$
using $n/2$ complete bipartite graphs.

Corollary 2. For infinitely many $n \equiv 1, 3 \pmod 4$ we have an odd cover of $K_n$
using $\lceil n/2 \rceil$ complete bipartite graphs.

We can also prove the following lemma. 

Lemma 4. If $S^2_n(X)$, $n \equiv 0 \pmod 4$, can be computed over GF(2) by a homogeneous
$\Sigma\Pi\Sigma$ circuit using $n/2$ multiplication gates, then $S^2_{n+1}(X)$ can be computed
over GF(2) by a homogeneous $\Sigma\Pi\Sigma$ circuit using $n/2$ multiplication gates.

Proof. The proof appears in the full version of the paper [11]. □ 

From the above, we can now prove the following theorem. 

Theorem 3. For infinitely many $n \equiv 0, 2, 3 \pmod 4$ we have homogeneous $\Sigma\Pi\Sigma$
circuits computing $S^2_n(X)$ over GF(2) using $\lceil n/2 \rceil$ multiplication gates. For
infinitely many $n \equiv 1 \pmod 4$ we can compute $S^2_n(X)$ over GF(2) using homogeneous
$\Sigma\Pi\Sigma$ circuits having $\lfloor n/2 \rfloor$ multiplication gates.
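The passage from odd covers to circuits can be illustrated on a small case. Each complete bipartite graph $(A, B)$ contributes one multiplication gate $(\sum_{i \in A} X_i)(\sum_{j \in B} X_j)$, and since every edge is covered an odd number of times the sum of the gates equals $S^2_n(X)$ over GF(2). The sketch below checks this for $n = 8$ with four gates, using a hypothetical $4 \times 4$ skew conference matrix `C4` and assuming that the two vertices of a participating pair lie in opposite parts.

```python
from itertools import product

# Hypothetical test input: a 4 x 4 skew-symmetric conference matrix.
C4 = [[0, 1, 1, 1],
      [-1, 0, 1, -1],
      [-1, -1, 0, 1],
      [-1, 1, -1, 0]]

def gates_from_matrix(M):
    """One gate (A_j, B_j) per column; variables are indexed 0 .. 2n-1.

    Row i controls variables 2i and 2i+1, i.e. the pair (2i+1, 2i+2) in
    1-based numbering. Assumption: a participating pair is split between
    the two parts, with the even vertex in B when the entry is +1.
    """
    gates = []
    for j in range(len(M[0])):
        A, B = [], []
        for i, row in enumerate(M):
            if row[j] == 1:
                A.append(2 * i)
                B.append(2 * i + 1)
            elif row[j] == -1:
                B.append(2 * i)
                A.append(2 * i + 1)
        gates.append((A, B))
    return gates

def circuit_value(gates, x):
    """Evaluate the sum of products of linear forms over GF(2) at a 0/1 point x."""
    total = 0
    for A, B in gates:
        total += (sum(x[i] for i in A) % 2) * (sum(x[i] for i in B) % 2)
    return total % 2

def s2(x):
    """S^2_n at a 0/1 point: the number of pairs of ones, reduced mod 2."""
    w = sum(x)
    return (w * (w - 1) // 2) % 2
```

Comparing `circuit_value` with `s2` on all $2^8$ inputs confirms the identity for this example.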

4 Lower Bounds 

4.1 Preliminaries 

We first develop a framework for showing lower bounds for $S^2_n(X)$ based on the
method of substitution [13, 12]. Suppose that over a field $\mathbb{F}$

$$S^2_n(X) = \sum_{i=1}^{r} \prod_{j=1}^{s_i} L_{ij}(X), \tag{5}$$

where each $L_{ij}$ is a linear form, not necessarily homogeneous. We wish to show
that $r$ must be large. Following the proof of the Graham-Pollack theorem that
was sketched in the introduction, we could try to force some of the $L_{ij}$'s to zero
by setting the variables to appropriate field elements. There are two difficulties
with this plan. First, since the $L_{ij}$'s are not necessarily homogeneous, we may not
be able to set all of them to zero; we can do so if the linear forms have linearly
independent homogeneous parts. The second difficulty arises from the nature
of the underlying field: as observed earlier, $S^2_n(X)$ might vanish on non-trivial
subspaces of $\mathbb{F}^n$.

In this section, our goal is to first show that if $r$ is small, then $S^2_n(X)$ must be
zero over a linear subspace of large dimension. A similar observation was used
by Shpilka and Wigderson [13] and Shpilka [12]. Our second goal is to examine
linear subspaces over which $S^2_n(X)$ is forced to be zero. We derive conditions on
such subspaces, and relate them to the existence of a certain family of vectors.
Later on, we will exploit these equations, based on the field in question, and
derive our lower bounds for $r$.

Goal 1: Obtaining the Subspace. 

Lemma 5. If $S^2_n(X)$ can be written in the form of (5) over a field $\mathbb{F}$, then there
exist homogeneous linear forms $\ell_1, \ell_2, \ldots, \ell_r$ in variables $X_1, X_2, \ldots, X_{n-r}$ such
that

$$S^2_n(X_1, X_2, \ldots, X_{n-r}, \ell_1, \ell_2, \ldots, \ell_r) = 0.$$

Proof. The proof appears in the full version of the paper [11]. □



Goal 2: The Nature of the Subspace. Our goal now is to understand the
algebraic structure of the coefficients that appear in the linear forms $\ell_1, \ell_2, \ldots, \ell_r$
promised by Lemma 5. Let $\ell_i = \sum_{j=1}^{n-r} \ell_{ij} X_j$, and let $L$ be the $r \times (n-r)$
matrix $(\ell_{ij})$. Let $Y_1, Y_2, \ldots, Y_{n-r} \in \mathbb{F}^r$ be the $n - r$ columns of $L$. We will obtain
conditions on the columns by computing the coefficients of monomials $X_j^2$ for
$1 \le j \le n - r$, and $X_i X_j$ for $1 \le i < j \le n - r$. For $X_j^2$ ($1 \le j \le n - r$), we
obtain the following equation over $\mathbb{F}$,

$$\sum_{m=1}^{r} \ell_{mj} + \sum_{1 \le m < m' \le r} \ell_{mj} \ell_{m'j} = 0. \tag{6}$$

For monomials of the form $X_i X_j$ ($1 \le i < j \le n - r$), we obtain the following
equation over $\mathbb{F}$,

$$1 + \sum_{m=1}^{r} \ell_{mi} + \sum_{m=1}^{r} \ell_{mj} + \sum_{1 \le m < m' \le r} (\ell_{mi} \ell_{m'j} + \ell_{m'i} \ell_{mj}) = 0. \tag{7}$$

For a positive integer $m$, let $\mathbf{1}_m$ be the all 1's column vector and $\mathbf{0}_m$ be the
all 0's column vector of dimension $m$. Let $U_m$ be the $m \times m$ matrix with 1's
above the diagonal and zero elsewhere. Let $J_m$ be the $m \times m$ matrix with all 1's,
and let $I_m$ be the $m \times m$ identity matrix. Using this notation, we can rewrite
(6) and (7) as follows:

$$\mathbf{1}_r^T Y_j + Y_j^T U_r Y_j = 0, \quad \text{for } 1 \le j \le n - r; \tag{8}$$

$$1 + \mathbf{1}_r^T Y_i + \mathbf{1}_r^T Y_j + Y_i^T (J_r - I_r) Y_j = 0, \quad \text{for } 1 \le i < j \le n - r. \tag{9}$$

If the characteristic of the field is not two, we may rewrite (8) as

$$2 \cdot \mathbf{1}_r^T Y_j + Y_j^T (J_r - I_r) Y_j = 0, \quad \text{for } 1 \le j \le n - r. \tag{10}$$

With this, we are now ready to prove lower bounds. We will exploit (8), (9)
and (if the characteristic is not 2) (10) to derive lower bounds for various fields.
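For very small $n$, equations (8) and (9) can be explored over GF(2) by exhaustive search. The sketch below looks for the smallest $r$ such that some $r \times (n-r)$ matrix over GF(2) has columns satisfying (8) and (9); it is a brute-force illustration with hypothetical function names, not the method used in the proofs, and it is only practical for single-digit $n$.

```python
from itertools import combinations, combinations_with_replacement, product

def eq8(Y):
    """Equation (8) over GF(2): 1^T Y + Y^T U Y = 0."""
    pairs = sum(Y[a] * Y[b] for a, b in combinations(range(len(Y)), 2))
    return (sum(Y) + pairs) % 2 == 0

def eq9(Yi, Yj):
    """Equation (9) over GF(2): 1 + 1^T Yi + 1^T Yj + Yi^T (J - I) Yj = 0."""
    cross = sum(Yi) * sum(Yj) - sum(a * b for a, b in zip(Yi, Yj))
    return (1 + sum(Yi) + sum(Yj) + cross) % 2 == 0

def substitution_exists(n, r):
    """Is there a choice of n - r columns in GF(2)^r satisfying (8) and (9)?"""
    columns = [Y for Y in product((0, 1), repeat=r) if eq8(Y)]
    # (9) is symmetric in its two columns, so the order of the columns is irrelevant.
    for choice in combinations_with_replacement(columns, n - r):
        if all(eq9(a, b) for a, b in combinations(choice, 2)):
            return True
    return False

def min_r(n):
    """Smallest r for which the substitution exists (r = n always works)."""
    return next(r for r in range(n + 1) if substitution_exists(n, r))
```

The values returned by `min_r` for small $n$ can be compared with the bounds proved in Sect. 4.2.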

4.2 Lower Bounds for GF(2) 

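Let $\mathbb{Z}$ stand for the integers. For $Y \in \mathbb{Z}^r$, let $|Y|$ be the number of odd
components in $Y$. For $Y, Y' \in \mathbb{Z}^r$, let $Y \cdot Y' = \sum_{m=1}^{r} Y_m Y'_m$ be the dot product
of $Y$ and $Y'$ over $\mathbb{Z}$.

Lemma 6. Suppose $\ell_1, \ldots, \ell_r$ are homogeneous linear forms in the variables
$X_1, \ldots, X_{n-r}$ such that

$$S^2_n(X_1, \ldots, X_{n-r}, \ell_1, \ldots, \ell_r) = 0$$

over GF(2). Then $r \ge \lfloor n/2 \rfloor$. If $n \equiv 3 \pmod 4$, then $r \ge \lceil n/2 \rceil$.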

Proof. We use the arguments of Sect. 4.1. If there exist homogeneous linear forms
$\ell_1, \ldots, \ell_r$ over variables $X_1, \ldots, X_{n-r}$ so that $S^2_n(X_1, \ldots, X_{n-r}, \ell_1, \ldots, \ell_r) = 0$
over GF(2), we have (8) and (9). We treat the vectors $Y_j$ as elements of $\mathbb{Z}^r$ and
the equalities as equalities over integers (mod 2). By counting the number of
odd components (i.e. 1's) on the left and right hand side of (8), we obtain, for
$1 \le j \le n - r$,

$$|Y_j| + \binom{|Y_j|}{2} \equiv 0 \pmod 2,$$

from which it follows that

$$|Y_j| \equiv 0 \text{ or } 3 \pmod 4. \tag{11}$$

Since $Y_i^T (J_r - I_r) Y_j \equiv |Y_i| |Y_j| - Y_i \cdot Y_j \pmod 2$, we conclude from (9) that, for
$1 \le i < j \le n - r$,

$$|Y_i| + |Y_j| + |Y_i| |Y_j| + Y_i \cdot Y_j \equiv 1 \pmod 2,$$

that is,

$$Y_i \cdot Y_j \equiv (1 + |Y_i|)(1 + |Y_j|) \pmod 2. \tag{12}$$

Let $W_1, \ldots, W_s$ be the vectors among $Y_1, \ldots, Y_{n-r}$ with $|Y_j|$ odd, and let
$E_1, \ldots, E_t$ be the remaining $t = n - r - s$ vectors, with $|Y_j|$ even.

Claim. If $Y_1, Y_2, \ldots, Y_{n-r}$ are not linearly independent over GF(2), then $t$ is
odd and the only dependency over GF(2) among them is $\sum_{k=1}^{t} E_k \equiv \mathbf{0}_r \pmod 2$.
Proof. The proof appears in the full version of the paper [11]. □

By the claim above, we see that there are at least $n - r - 1$ linearly independent
vectors over GF(2) among the $Y_j$'s. Since $Y_j \in \mathbb{Z}^r$, we get $r \ge n - r - 1$, i.e.
$r \ge \lfloor n/2 \rfloor$.

To obtain a better bound for $n \equiv 3 \pmod 4$, we make better use of our
equations. So suppose $n = 2r + 1$ and $n \equiv 3 \pmod 4$. We shall derive a contradiction.

If $n = 2r + 1$, then $n - r > r$, and since $Y_j \in \mathbb{Z}^r$, the vectors $Y_j$ are not
linearly independent over GF(2). Then by the claim above, $t$ is odd,
$\sum_{k=1}^{t} E_k \equiv \mathbf{0}_r \pmod 2$, and $W_1, \ldots, W_s, E_1, \ldots, E_{t-1}$ are linearly independent over GF(2).
Since $s + t - 1 = n - r - 1 = r$, these vectors span (over GF(2)) the entire vector
space $GF(2)^r$; in particular, $\mathbf{1}_r$ is in their span:

$$\sum_{i=1}^{s} \alpha_i W_i + \sum_{k=1}^{t-1} \beta_k E_k \equiv \mathbf{1}_r \pmod 2$$

for some $\alpha_i, \beta_k \in \mathbb{Z}$. Taking dot products with $W_i$ and $E_k$, we conclude (using
(12)) that $\alpha_i \equiv 1 \pmod 2$ for $1 \le i \le s$, and $(J_{t-1} - I_{t-1})\beta \equiv \mathbf{0}_{t-1} \pmod 2$, where
$\beta \in \mathbb{Z}^{t-1}$ and the $k$th component of $\beta$ is $\beta_k$. Since $t$ is odd, $J_{t-1} - I_{t-1}$ is full
rank over GF(2) and $\beta \equiv \mathbf{0}_{t-1} \pmod 2$. Thus,

$$\sum_{i=1}^{s} W_i \equiv \mathbf{1}_r \pmod 2.$$

It is easy to verify that for all integer vectors $Y$,

$$|Y| \equiv Y \cdot Y \pmod 4. \tag{13}$$

Thus, $\left(\sum_{i=1}^{s} W_i\right) \cdot \left(\sum_{i=1}^{s} W_i\right) \equiv \left|\sum_{i=1}^{s} W_i\right| \equiv r \pmod 4$, that is

$$\sum_{i=1}^{s} W_i \cdot W_i + 2 \sum_{1 \le i < j \le s} W_i \cdot W_j \equiv r \pmod 4.$$

By (11) and (13), $W_i \cdot W_i \equiv 3 \pmod 4$, and by (12), $W_i \cdot W_j \equiv 0 \pmod 2$ for $i \ne j$.
Thus,

$$3s \equiv r \pmod 4. \tag{14}$$

Similarly, starting with $\sum_{k=1}^{t} E_k \equiv \mathbf{0}_r \pmod 2$, we obtain $t(t - 1) \equiv 0 \pmod 4$;
since $t$ is odd, $t \equiv 1 \pmod 4$. But then, using (14),

$$n = r + s + t \equiv 3s + s + 1 \equiv 1 \pmod 4,$$

which is a contradiction.

Since $r \ge \lfloor n/2 \rfloor$ holds for all $n$, we have shown that if $n \equiv 3 \pmod 4$, then
$r \ge \lceil n/2 \rceil$. □



Using Lemma 5 and the above lemma, we can now prove the following theorem.

Theorem 4. Any (inhomogeneous) $\Sigma\Pi\Sigma$ circuit computing $S^2_n(X_1, \ldots, X_n)$
over GF(2) requires at least $\lceil n/2 \rceil$ multiplication gates if $n \equiv 0, 2, 3 \pmod 4$ and
at least $\lfloor n/2 \rfloor$ multiplication gates if $n \equiv 1 \pmod 4$.



Abstract. Here we deal with the question of definability of infinite
graphs up to isomorphism by weak monadic second-order formulae. In
this respect, we prove that the quantifier alternation bounded hierarchy
of equational graphs is infinite. Two proofs are given: the first one is based
on the Ehrenfeucht-Fraïssé games; the second one uses the arithmetical
hierarchy. Next, we give a new proof of Thomas's result according to
which the bounded hierarchy of the weak monadic second-order logic of
the complete binary tree is infinite.



Introduction 

Logic is by now a classical means in theoretical computer science to describe
complexity issues; it has been used in many areas, for instance effective
computability (cf. [20]), descriptive complexity (cf. [7,15]), or formal language
theory (cf. [16]).

This paper deals with the question of definability of infinite graphs up to 
isomorphism by logical formulae. The graphs which are studied here are equa- 
tional graphs which have been introduced in [4] as models of program schemes. 
Such a graph is the inductive limit of a sequence of finite graphs generated by a 
deterministic hyperedge replacement grammar (cf. [5,21]). They extend strictly 
context-free graphs which have been introduced in [14] as configuration graphs 
of pushdown automata. Note that these kinds of graphs generalize the concept 
of regular trees; such a tree is defined as the tree of all the runs of a finite state 
automaton (see [3]). 

We deal with monadic second-order formulae on graphs (MS-formulae for
short), that is, logical formulae which treat graphs as relational
structures and use individual and set variables ranging over vertices and edges.
We consider the weak monadic second-order logic (WMS-logic for short) which
consists in interpreting the MS-formulae with set variables ranging
over finite sets of vertices and edges. WMS-logic is a classical extension of
first-order logic. It is a variation of the well-known monadic second-order logic
(MS-logic for short) (cf. [21,8]). Monadic second-order logic is related to
the concept of equational graphs by the two following results: first, one
can decide in an effective way whether a given equational graph satisfies a given
MS-formula according to MS-logic, which is also true for WMS-logic (cf. [4], for
generalizations cf. [21,2]). Second, the equational graphs are exactly the graphs
of bounded tree-width (cf. [17]) which are definable up to isomorphism by
MS-formulae (cf. [5]) according to MS-logic. Note that these results generalise the
fundamental results of [1] and [18,19] where MS and WMS were studied in the
contexts of infinite words and infinite trees.

The present work is motivated by the conjecture of [5] according to which
any equational graph is definable up to isomorphism by a formula of WMS-logic.
This is true if one considers MS-logic (cf. [5]); let us note that this implies that
the isomorphism problem for equational graphs is decidable. Concerning
WMS-logic, some steps have already been made: the conjecture was first proved
to be true for context-free graphs (cf. [22]), and then for the equational graphs
which have covering trees of finite degree (cf. [23]).

Here, we investigate equational WMS-definable graphs through the study of
the quantifier alternation bounded hierarchy. A graph is said to be in the
n-th level of the bounded WMS-hierarchy if there exists a WMS-formula which
defines it up to isomorphism and which has n - 1 alternations of existential and
universal unbounded quantifiers. One says that the hierarchy is infinite if for
each integer n, the n-th level is strictly included in the (n + 1)-th one. The main
result of this paper is the following:

Theorem 1. The bounded WMS-hierarchy of equational graphs is infinite.

In order to see how this theorem fits in existing works, let us now mention 
some results about hierarchies relating to monadic second-order logical systems. 
Firstly, the bounded MS and WMS-hierarchies of languages of infinite words, i.e.
the one successor theory, are finite, which follows from [1] and [12]. Next, the 
bounded MS-hierarchy of languages of infinite binary trees, i.e. the two succes- 
sors theory (MS2S for short) is also finite, which follows from Rabin’s Theorem 
(cf. [18], see also e.g. [9] for further results), while the weak one (the bounded 
WMS2S-hierarchy) is infinite (cf. [24], see also [13]). Concerning definability of 
graphs up to isomorphism, it follows from the results of [5] that the bounded 
MS-hierarchy of equational graphs is finite. Theorem 1 shows that the situation 
is different when considering WMS-logic, even though it follows from [22] that 
the bounded WMS-hierarchy of context-free graphs is finite and stops at most 
at the fifth level. 

As we mentioned above, the best result concerning the conjecture of
WMS-definability of equational graphs was obtained in [23]. The WMS-formulae which
are constructed in that work have unbounded numbers of quantifier alternations;
Theorem 1 shows that one cannot get away from this fact: equational graphs are
hard to define up to isomorphism by using weak monadic second-order formulae.

We give two proofs of Theorem 1. The first one is based on an extension to
WMS-logic of the classical technique of Ehrenfeucht-Fraïssé games (cf. [8,10]). The
method is similar to the well-known construction used to show that FO(LFP)
is strictly more expressive than FO(TC) (see e.g. [6]). The second one is based
on the infiniteness of the arithmetical hierarchy (see [20]). It allows us to recover
the fact that the WMS2S-hierarchy is infinite, a result due to [24], where it was
also proved using the arithmetical hierarchy, together with Rabin's Theorem (cf. [18]).
The first proof is in some sense stronger than the second one because the graphs
which are constructed in it have covering trees of finite degree, which is not true
for the second proof. On the other hand, the second proof allows us to recover
the result of [24] without using Rabin's Theorem, which does not seem to be
possible with the game-based proof.

Acknowledgements. The author is greatly indebted to his supervisor G. 
Senizergues for having spent much time in very helpful discussions. He also wants 
to thank the reviewers for their work which led to important improvements of 
this paper. 

1 Preliminary 

For an introduction to monadic second-order logical systems, the reader is referred
to [6,8], cf. also [21, chap. 5].

We deal with labelled directed multi-graphs (graphs for short), i.e. tuples
$(V, E, vert, A, lab)$ where $V$ is an at most countable set whose elements are the
vertices; $E$ is the set of edges; $vert : E \to V \times V$ is the map defining the origin
and the target of each edge. Vertices and edges are labelled by elements of $A$
according to the map $lab : V \cup E \to A$. Such a graph is represented by a logical
relational structure $(D, (D_a)_{a \in A}, inc)$ where $D = V \cup E$; for all $a \in A$, $D_a \subseteq D$
is a unary relation on $D$ defining the elements labelled by $a$, and $inc$ is a ternary
relation defining incidence in the sense that $(x, y, z) \in inc$ if and only if $x \in E$
is an edge of origin $y \in V$ and target $z \in V$. In order to simplify notations,
the relational structure associated to a graph $G$ is still denoted by $G$. We
consider monadic second-order formulae on graphs. These formulae are constructed
using individual variables (usually denoted by latin letters $x, y, z, \ldots$) and set
variables (usually denoted by greek letters $\alpha, \beta, \ldots$). Atomic formulae are of the
forms $x \in \alpha$ or $\alpha \subseteq \beta$ or $inc(x, y, z)$ or else $x \in D_a$ with $a \in A$. Syntax is not
restricted: we allow existential and universal quantifiers over individual and set
variables, conjunctions and negations. The semantics of such formulae differs in
the monadic second-order logic and in the weak monadic second-order logic: in
the former, set variables range over all the subsets of the domain, while in
the latter they range over finite subsets only.

Remark 1. What we call here MS-logic of graphs is usually denoted by
MS$_2$-logic in order to distinguish it from MS$_1$-logic in which quantifications are done
over vertex sets only (cf. e.g. [21]). MS$_1$-logic deals with simple graphs; such a
graph is encoded by a relational structure whose domain is its set of vertices
only, instead of the union of its set of vertices and its set of edges as we consider
here. MS$_2$ is more expressive than MS$_1$. This is why we have chosen to prove
Theorem 1 for MS$_2$. As we shall see, it is true for MS$_1$ as well (see Remark 3).

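A formula is said to be in prenex form if it is of the form $Q_1 X_1 \ldots Q_n X_n \varphi$
where for each $i$, $Q_i \in \{\exists, \forall\}$, the $X_i$'s are variables and $\varphi$ does not contain any
quantifier. Any formula can be put in prenex form in an effective way (cf. e.g. [8]).
Let $\alpha$ and $\beta$ be some set variables; let $\varphi(\alpha, \beta)$ be a formula with $\alpha$ and $\beta$
as free variables. The quantifier over $\alpha$ in the formula $\exists \alpha\, \varphi(\alpha, \beta)$ ($\forall \alpha\, \varphi(\alpha, \beta)$
respectively) is said to be bounded if this last formula is equivalent to
$\exists \alpha\, (\alpha \subseteq \beta \wedge \varphi(\alpha, \beta))$ ($\forall \alpha\, (\alpha \subseteq \beta \Rightarrow \varphi(\alpha, \beta))$ respectively). In this case it is denoted by
$\exists \alpha \subseteq \beta\ \varphi(\alpha, \beta)$ ($\forall \alpha \subseteq \beta\ \varphi(\alpha, \beta)$ respectively). A formula in which all the quantifiers
are bounded is said to be bounded. As stated by the next lemma, bounded
quantifications can always be put after the unbounded ones (cf. [13] for a proof).

Lemma 1. Let $\psi(\alpha, \beta, \gamma)$ be a formula, then we have:

• $\exists \alpha \subseteq \beta\ \forall \gamma\ \psi(\alpha, \beta, \gamma) \equiv \forall \gamma\ \exists \alpha \subseteq \beta\ \forall \gamma' \subseteq \gamma\ \psi(\alpha, \beta, \gamma')$

• $\exists \alpha \subseteq \beta\ \exists \gamma\ \psi(\alpha, \beta, \gamma) \equiv \exists \gamma\ \exists \alpha \subseteq \beta\ \psi(\alpha, \beta, \gamma)$

• $\forall \alpha \subseteq \beta\ \forall \gamma\ \psi(\alpha, \beta, \gamma) \equiv \forall \gamma\ \forall \alpha \subseteq \beta\ \psi(\alpha, \beta, \gamma)$

• $\forall \alpha \subseteq \beta\ \exists \gamma\ \psi(\alpha, \beta, \gamma) \equiv \exists \gamma\ \forall \alpha \subseteq \beta\ \exists \gamma' \subseteq \gamma\ \psi(\alpha, \beta, \gamma')$

A formula is called a $\Sigma_n$-formula (a $\Pi_n$-formula respectively) if it has the
form $\exists X_1 \forall X_2 \ldots \varphi(X_1, X_2, \ldots, X_n)$ ($\forall X_1 \exists X_2 \ldots \varphi(X_1, X_2, \ldots, X_n)$
respectively) where $\varphi$ is bounded. The above lemma implies that any formula is
equivalent to a $\Sigma_n$-formula (a $\Pi_n$-formula respectively) for a suitable $n$.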
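The level of a formula in this classification depends only on its prefix of unbounded quantifiers. As a small illustration, the following sketch reads such a prefix as a string over the hypothetical alphabet `E` (existential) and `A` (universal) and counts its alternating blocks; treating consecutive quantifiers of the same kind as one block is an assumption made here for simplicity.

```python
def prefix_class(prefix):
    """Classify a prefix of unbounded quantifiers given as a string over 'E' and 'A'.

    Returns ('Sigma', n) or ('Pi', n), where n is the number of alternating
    blocks, i.e. the number of quantifier alternations plus one.
    An empty prefix (a bounded formula) is reported as ('bounded', 0).
    """
    if not prefix:
        return ("bounded", 0)
    if set(prefix) - {"E", "A"}:
        raise ValueError("prefix must only contain 'E' and 'A'")
    blocks = 1 + sum(1 for a, b in zip(prefix, prefix[1:]) if a != b)
    return ("Sigma" if prefix[0] == "E" else "Pi", blocks)
```

For example, a prefix of the shape $\exists \forall \forall \exists$ has two alternations and is classified at level 3.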

This classification of formulae provides a classification of definable graph
properties, i.e. the properties which can be expressed by logical formulae. What we
call here a graph property is formally a class of graphs, for instance the class of
all the connected graphs. Let $\Sigma_n$ ($\Pi_n$ respectively) denote the set of families of
graphs which are WMS-definable by some $\Sigma_n$-formulae (by $\Pi_n$-formulae
respectively). Note that for all $n > 0$: $\Sigma_n \cup \Pi_n \subseteq \Sigma_{n+1} \cap \Pi_{n+1}$. This classification
of WMS-definable graph families, i.e. the sequence $(\Sigma_n)_{n > 0}$, is called the bounded
weak monadic quantifier alternation hierarchy.

2 Ehrenfeucht-Fraïssé Games

2.1 Ehrenfeucht-Fraïssé Games for Weak Monadic Second-Order Logic

Let $G = (V_G, E_G, vert_G, A, lab_G)$ and $G' = (V_{G'}, E_{G'}, vert_{G'}, A, lab_{G'})$ be two
$A$-labelled graphs. Let $\bar{a}$ and $\bar{a}'$ be some finite parts of $G$ and $G'$ respectively,
i.e. finite subsets of vertices and edges. We call partial isomorphism between
$\bar{a}$ and $\bar{a}'$ a one-to-one mapping $\sigma : \bar{a} \to \bar{a}'$ which preserves the adjacency, i.e.
$\forall a, x, y \in \bar{a}$: $vert_G(a) = (x, y)$ if and only if $vert_{G'}(\sigma(a)) = (\sigma(x), \sigma(y))$, and
the labels, i.e. $\forall a \in \bar{a}$: $lab_G(a) = lab_{G'}(\sigma(a))$. We shall identify such a mapping
with its graph, i.e. the subset of $(V_G \cup E_G) \times (V_{G'} \cup E_{G'})$ which encodes it. If
$\sigma$ and $\sigma'$ are two partial isomorphisms, we say that $\sigma$ extends $\sigma'$ if $\sigma' \subseteq \sigma$ as
subsets of $(V_G \cup E_G) \times (V_{G'} \cup E_{G'})$. The set of partial isomorphisms between two
finite parts $\bar{a}$ and $\bar{a}'$ is denoted by PartIsom($\bar{a}$, $\bar{a}'$).
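The definition of a partial isomorphism translates directly into a finite check. In the sketch below a graph is given by two dictionaries, one mapping each edge to its (origin, target) pair and one mapping each element to its label, and the candidate mapping is a dictionary; this encoding and the function name are assumptions made for the illustration.

```python
def is_partial_isomorphism(sigma, vert_g, lab_g, vert_h, lab_h):
    """Check that sigma (dict: elements of G -> elements of G') is a partial isomorphism.

    vert_g, vert_h: dict mapping each edge to its (origin, target) pair.
    lab_g, lab_h:   dict mapping each vertex or edge to its label.
    """
    # One-to-one: no two elements share an image.
    if len(set(sigma.values())) != len(sigma):
        return False
    # Labels are preserved.
    for element, image in sigma.items():
        if lab_g[element] != lab_h[image]:
            return False
    # Adjacency is preserved in both directions.
    for a in sigma:
        for x in sigma:
            for y in sigma:
                in_g = vert_g.get(a) == (x, y)
                in_h = vert_h.get(sigma[a]) == (sigma[x], sigma[y])
                if in_g != in_h:
                    return False
    return True
```

The triple loop mirrors the quantification over $a, x, y$ in the definition and is only meant for small finite parts.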

Let A and B be two players; a session in the Ehrenfeucht-Fraïssé game
associated to the pair of graphs $(G, G')$ goes on as follows: at the first round,
A chooses a finite part $\bar{a}_1$ of $G$, and then B replies with a finite part $\bar{a}'_1$ of $G'$
together with a partial isomorphism $\sigma_1 \in$ PartIsom($\bar{a}_1$, $\bar{a}'_1$). At the second round,
A chooses a finite part $\bar{a}'_2$ of $G'$ and B replies with a finite part $\bar{a}_2$ of $G$ together
with a partial isomorphism $\sigma_2 \in$ PartIsom($\bar{a}_1 \cup \bar{a}_2$, $\bar{a}'_1 \cup \bar{a}'_2$) which extends $\sigma_1$;
and so on, A chooses finite parts alternately in $G$ and $G'$ and B extends the
isomorphism as A goes along. The game stops when B cannot find a good
answer. We say that B has a strategy of order $n$ if he can play the first $n$ rounds
whatever choices A makes.

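The following results show the classical link between this game viewpoint
and logic.

Lemma 2. Let $G$ and $G'$ be two graphs and let $\varphi(\alpha_1, \ldots, \alpha_n)$ be a bounded
formula. Let $\bar{a}_1, \ldots, \bar{a}_n$ ($\bar{a}'_1, \ldots, \bar{a}'_n$ respectively) be some finite parts of $G$ ($G'$
respectively) and let $\sigma \in$ PartIsom($\bigcup \bar{a}_i$, $\bigcup \bar{a}'_i$) be a partial isomorphism which
exchanges $\bar{a}_i$ and $\bar{a}'_i$ for all $i$; then we have $(G, \bar{a}_1, \ldots, \bar{a}_n) \models \varphi(\alpha_1, \ldots, \alpha_n)$ if and
only if $(G', \bar{a}'_1, \ldots, \bar{a}'_n) \models \varphi(\alpha_1, \ldots, \alpha_n)$.

Lemma 3. Let $G$ and $G'$ be two graphs such that B has a strategy of order $n$ in
the Ehrenfeucht-Fraïssé game associated to $(G, G')$; then for all $\Sigma_n$-formula $\varphi$,
$G \models \varphi$ implies that $G' \models \varphi$.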



2.2 First Proof of Theorem 1 

This section is devoted to the proof of Theorem 1. 

We begin by defining two sequences of graphs $(G_n)_{n \ge 1}$ and $(G'_n)_{n \ge 1}$ such
that for all $n \ge 1$, $G_n$ and $G'_n$ are not isomorphic and B has a strategy of order
$n$ in the Ehrenfeucht-Fraïssé game associated to $(G_n, G'_n)$.

Let us set $V = (\mathbb{Z})^*$, i.e. the set of finite sequences of integers, and
$E = (\mathbb{Z})^+ \times \{r, t\}$, where $(\mathbb{Z})^+$ denotes the set of finite non-empty sequences of integers,
$r$ and $t$ are two symbols ($r$ like radial and $t$ like transversal); let us consider the
mapping $vert : E \to V \times V$ defined as follows:

- $\forall e = ((x_1, \ldots, x_l), r) \in E$, $vert(e) = ((x_1, \ldots, x_{l-1}), (x_1, \ldots, x_{l-1}, x_l))$,

- $\forall e = ((x_1, \ldots, x_l), t) \in E$, $vert(e) = ((x_1, \ldots, x_l), (x_1, \ldots, x_l + 1))$.

Let $A$ be a set; let $Z(A)$ denote the set of $A$-labelled graphs whose vertex
set, edge set and edge mapping respectively are $V$, $E$ and $vert$. Let $G \in Z(A)$
and $z_0 \in V$; we denote by $G(z_0)$ the graph of $Z(A)$ defined by: $\forall z \in V \cup E$:
$lab_{G(z_0)}(z) = lab_G(z_0.z)$, where $z_0.z$ denotes the concatenation of $z_0$ and $z$ if
$z \in V$, and by abuse of notation $(z_0.u, r)$ or $(z_0.u, t)$ if $z = (u, r) \in E$ or
$z = (u, t) \in E$ respectively.

$G_n$ and $G'_n$ are now defined as elements of $Z(\{0, 1\})$. First, all edges are
labelled by 0: for all $n \ge 1$ and for all $e \in E$, let $lab_{G_n}(e) = lab_{G'_n}(e) = 0$. The
labels of vertices are defined inductively as follows:

- For all $z \in V$, let $lab_{G_1}(z) = 0$.

- For all $z \in V \setminus \{(0)\}$, let $lab_{G'_1}(z) = 0$, and let $lab_{G'_1}((0)) = 1$.




The Bounded Weak Monadic Quantifier Alternation Hierarchy 193 



- For $n \ge 2$, $G_n$ is defined from $G'_{n-1}$ as follows: $lab_{G_n}(()) = 0$ and $\forall z \in V$
such that $|z| = 1$: $G_n(z) = G'_{n-1}$.

- For $n \ge 2$, $G'_n$ is defined from $G_{n-1}$ and $G'_{n-1}$ as follows: $lab_{G'_n}(()) = 0$,
$\forall z \in V \setminus \{(0)\}$ such that $|z| = 1$: $G'_n(z) = G'_{n-1}$, and $G'_n((0)) = G_{n-1}$.
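The inductive definition of the vertex labels can be read as a recursive function. The sketch below is an illustration with hypothetical names; vertices are tuples of integers, and the label 1 given to the vertex (0) of $G'_1$ follows the proof sketch of Lemma 4, which describes (0) as the unique element of $G'_1$ labelled by 1.

```python
def label(n, primed, z):
    """Label of the vertex z (a tuple of integers) in G_n, or in G'_n if primed is True."""
    if n == 1:
        # G_1 is labelled 0 everywhere; in G'_1 only the vertex (0) carries label 1.
        return 1 if primed and z == (0,) else 0
    if len(z) == 0:
        return 0                              # the root () is labelled 0
    head, rest = z[0], z[1:]
    if primed and head == 0:
        return label(n - 1, False, rest)      # G'_n((0)) = G_{n-1}
    return label(n - 1, True, rest)           # all other first-level subgraphs are G'_{n-1}

def vert(edge):
    """Origin and target of an edge (sequence, kind), with kind 'r' (radial) or 't' (transversal)."""
    seq, kind = edge
    if kind == "r":
        return (seq[:-1], seq)
    return (seq, seq[:-1] + (seq[-1] + 1,))
```

Since edges are always labelled 0, only vertex labels are computed here.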



[Figure: the graphs $G_n$ and $G'_n$, drawn as a root vertex with copies of $G'_{n-1}$ (and, for $G'_n$, one copy of $G_{n-1}$) attached to its first-level vertices.]

Lemma 4. For all $n \ge 1$, B has a strategy of order $n$ in the Ehrenfeucht-Fraïssé
game associated to $(G_n, G'_n)$.

Sketch of proof: (induction on $n$)

• Case $n = 1$: Let $\bar{a}_1 \subseteq V_{G_1} \cup E_{G_1}$ be the choice of A in the first round of the
game. All the elements of $\bar{a}_1$ are labelled by 0 and the unique element of $V_{G'_1} \cup E_{G'_1}$
labelled by 1 is $(0) \in V_{G'_1}$, so B performs a shifting on the left of
$m = \min\{x_1 \in \mathbb{Z} \mid \exists x_2, \ldots, x_l \text{ such that } (x_1, \ldots, x_l) \in \bar{a}_1\}$ first level vertices; more precisely he uses
the one-to-one mapping $\lambda_m : V \to V$ defined by $\lambda_m((x_1, \ldots, x_l)) = (x_1 - m + 1, \ldots, x_l)$,
which extends in a natural way to $V \cup E$ to define an automorphism of
$(V, E, vert)$ which shall be still denoted by $\lambda_m$; B then chooses $\bar{a}'_1 = \lambda_m(\bar{a}_1)$ and
$\sigma_1 = \lambda_m|_{\bar{a}_1}$.

• Case $n > 1$: Suppose now that B has a strategy $\zeta$ of order $n - 1$ relatively to
$(G_{n-1}, G'_{n-1})$; we will then define a strategy of order $n$ relatively to $(G_n, G'_n)$.
The first round proceeds exactly like in the case $n = 1$; if $\bar{a}_1 \subseteq V_{G_n} \cup E_{G_n}$ is
the first choice of A, we define $m$, $\lambda_m$, and $\sigma_1$ as above.

Let us consider $\bar{a}'_2 \subseteq V_{G'_n} \cup E_{G'_n}$, the second choice of A, which we divide into two
parts $\bar{a}'^{1}_2 = \{(x_1, \ldots, x_l) \in \bar{a}'_2 \mid x_1 \ne 0\}$ and $\bar{a}'^{2}_2 = \{(x_1, \ldots, x_l) \in \bar{a}'_2 \mid x_1 = 0\}$. For
$\bar{a}'^{1}_2$, B uses the above shifting $\lambda_m$: let $\bar{a}^{1}_2 = \lambda_m^{-1}(\bar{a}'^{1}_2)$ and $\sigma^{1}_2 = \lambda_m|_{\bar{a}^{1}_2 \cup \bar{a}_1}$. For
$\bar{a}'^{2}_2$, let us remark that $G'_n((0)) = G_{n-1}$ and $G_n((m - 1)) = G'_{n-1}$, so $\bar{a}'^{2}_2$ induces
a finite part of $G_{n-1}$ which can be considered as the first choice of A in the game
associated to $(G_{n-1}, G'_{n-1})$; then $\zeta$ gives an answer, i.e. a finite part of $G'_{n-1}$
and a partial isomorphism, which induce a finite part $\bar{a}^{2}_2$ of the subgraph of $G_n$
of root $(m - 1)$ (the subset of $G_n$ of words of which $(m - 1)$ is a prefix) and a
partial isomorphism $\sigma^{2}_2 \in$ PartIsom($\bar{a}^{2}_2$, $\bar{a}'^{2}_2$).
Finally $\bar{a}^{1}_2 \cup \bar{a}^{2}_2$ and $\sigma^{1}_2 \cup \sigma^{2}_2$ is a correct answer of B.

For the succeeding rounds, B uses in the same way the strategy $\zeta$ in the
game relative to the pair $(G'_n((0)), G_n((m - 1)))$ and $\lambda_m$ in the rest. Since $\zeta$, the
strategy of B of order $n - 1$, intervenes in the second and later rounds, the above
method gives a strategy of order $n$. □
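The shifting used by B in the case $n = 1$ can be sketched as follows. For simplicity the sketch only handles vertices (edges would be shifted through their underlying sequence in the same way), and the function names are hypothetical.

```python
def shift(m, z):
    """The map lambda_m on a vertex z (a tuple of integers): subtract m - 1 from the first coordinate."""
    if not z:
        return z                      # the root () is left unchanged
    return (z[0] - m + 1,) + z[1:]

def first_round_reply(a1):
    """B's reply to a finite set a1 of vertices: the shifted set and the mapping sigma_1.

    m is the smallest first coordinate occurring in a1; if a1 contains no
    non-root vertex, m defaults to 1 and the shift is the identity.
    """
    m = min((z[0] for z in a1 if z), default=1)
    sigma = {z: shift(m, z) for z in a1}
    return set(sigma.values()), sigma
```

After the shift every non-root vertex of the reply has a first coordinate at least 1, so the reply avoids the vertex (0), which is the only element labelled 1.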

Since $G_n$ and $G'_n$ are not isomorphic, the preceding lemma together with
Lemma 3 proves the next result:

Lemma 5. For all $n \ge 1$, $G_n$ is not definable up to isomorphism by a
$\Sigma_n$-formula.

We have now to see that $G_n$ is indeed WMS-definable up to isomorphism,
which is stated by the next lemma.

Lemma 6. There exists an MS-formula $\Phi_n$ such that for any graph $G$, $G \models \Phi_n$
according to WMS-logic if and only if $G$ is isomorphic to $G_n$; moreover, there
exists $N \ge 1$ such that for all $n \ge N$, $\Phi_n$ can be constructed to be $\Sigma_{n+1}$.

Because of the lack of space, the proof is omitted (cf. [11]).

Hence, for all $n \ge N$, the isomorphism class of $G_n$ belongs to $\Sigma_{n+1} \setminus \Sigma_n$,
which proves that the bounded weak monadic quantifier alternation hierarchy is
infinite. To conclude the proof of Theorem 1, it remains to show that the bounded
hierarchy is still infinite when restricted to isomorphism classes of equational
graphs. This is true since the $G_n$'s actually are equational. Equational
graphs can be seen as canonical solutions of systems of graph equations (cf.
[5]). The lack of space makes it impossible to construct such systems in detail for
the $G_n$'s. Nevertheless, we give the main ideas: first, we have to note that the
graph which is made of one vertex of infinite degree which is connected to all the
vertices of an infinite linear graph is equational; let us denote it by $G$. Then, the
equations defining $G_1$ say that it is obtained by gluing one copy of itself on each
vertex of $G$ except the root, i.e. the vertex of $G$ of infinite degree. On the other
hand, $G'_1$ is obviously equational as it is obtained from $G_1$ by modifying the label
of only one vertex. Then, the equations defining $G_n$ and $G'_n$ are constructed by
induction from those defining $G$ and those defining $G_{n-1}$ and $G'_{n-1}$ by following
the definition scheme given above. For instance, $G_n$ is obtained by gluing a copy
of $G'_{n-1}$ on each vertex of $G$ except the root (cf. [11] for more details).

Let us note that instead of constructing directly such systems of graph
equations, one can notice that the $G_n$'s are of bounded tree-width (cf. [17]) and
WMS-definable, as we saw in Lemma 6. In view of the results of [5], this implies
that they are equational.

Remark 2. Transversal edges of $G_n$ are useless in the proof of Theorem 1.
However, they give the existence of a covering tree of finite degree, which, to some
extent, is meaningful because of the following: the equational graphs with covering
trees of finite degree have been proved to be WMS-definable up to isomorphism
(cf. [23]); but the numbers of quantifier alternations of the formulae which are
constructed in this proof are not bounded. We have thus shown that one
cannot get away from this fact.

Remark 3. Let us note that MS$_1$-logic (see Remark 1) gives rise to another
bounded quantifier alternation hierarchy. In this respect, it turns out that our
construction shows that this hierarchy is also infinite. First, one can verify
that an MS$_1$-formula can be translated into an MS$_2$-formula, adding at most one
quantifier alternation. Therefore, in view of Lemma 5, $G_n$ cannot be defined up
to isomorphism by an MS$_1$-formula with fewer than $n - 1$ quantifier alternations.
Second, one shows that $G_n$ is MS$_1$-definable.

Remark 4. The bounded weak monadic quantifier alternation hierarchy differs
from the weak monadic quantifier alternation hierarchy (weak monadic hierarchy
for short) which is defined in an analogous way by considering first-order formulae
instead of bounded formulae. In this respect, one shows that there exists a fixed
integer $k$ such that for all $n$, the isomorphism class of $G_n$ belongs to the $k$-th
level of the weak monadic hierarchy. Indeed, the formula $\Phi_n$ given in Lemma 6
actually consists in the conjunction of a fixed weak monadic formula which
defines the family $Z(\{0, 1\})$, and a first-order formula $\varphi_n$ which checks the labels
of $G_n$. So, the level of $\Phi_n$ in the weak monadic hierarchy is that of the fixed weak
monadic formula. We hence obtain that the $k$-th level of the weak monadic hierarchy contains
instances beyond any given level of the bounded weak monadic hierarchy.

3 Arithmetical Hierarchy and Graph Hierarchy 

3.1 Arithmetical Hierarchy 

For basics about effective computability, the reader is referred to [20].

Let $\tau^* : (\mathbb{N})^* \to \mathbb{N}$ be a Gödel numbering of finite integer sequences, i.e. a
bi-recursive one-one mapping (cf. [20, p. 70] for such a construction); as usual,
$\tau^*((x_1, \ldots, x_k))$ shall be sometimes denoted by $\langle x_1, \ldots, x_k \rangle$.
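One possible numbering of finite sequences of natural numbers is sketched below. It is not the construction of [20], only a simple bijection built from the Cantor pairing function, with hypothetical function names: the empty sequence gets code 0, and a non-empty sequence is coded from its first element and the code of its tail.

```python
from math import isqrt

def pair(a, b):
    """Cantor pairing: a bijection from pairs of naturals to naturals."""
    s = a + b
    return s * (s + 1) // 2 + b

def unpair(z):
    """Inverse of the Cantor pairing."""
    w = (isqrt(8 * z + 1) - 1) // 2
    b = z - w * (w + 1) // 2
    return w - b, b

def encode(seq):
    """Code of a finite sequence: 0 for (), else 1 + pair(first element, code of the rest)."""
    code = 0
    for x in reversed(seq):
        code = 1 + pair(x, code)
    return code

def decode(code):
    """Inverse of encode; terminates because the code of the tail is smaller than the code."""
    seq = []
    while code > 0:
        x, code = unpair(code - 1)
        seq.append(x)
    return tuple(seq)
```

Both directions are computable by simple loops, which is the "bi-recursive" property required of the numbering.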

Let {M^)i>o be a Godel numbering of the set oracle Turing machines. For 
any A C N and z > 0, let ff^ : W/^ C N ^ {0, 1} be the partial function 
computed by with A as oracle, where = {a; G N| stops on the 
instance x using A as oracle } and Vx G ff'{x) = 0 if and only if give 
0 on the instance x using A as oracle. Glassical Turing machines are identified 
with oracle machines with 0 as oracle. For all k, /® and IF® shall be denoted by 
fk and IFfe respectively; fk is called the k-th recursive partial function. Let Rec 
denote the set of all the recursive functions. 

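Let $\Sigma_n^{arith}$ be the set of subsets of $\mathbb{N}$ of the form
$\{k \mid \exists k_1 \forall k_2 \cdots \forall k_n\ f_{i_0}(\langle k_1, k_2, \ldots, k_n, k \rangle) = 1\}$
for any fixed $i_0$. We also consider $\Pi_n^{arith}$, which denotes the set of
subsets of $\mathbb{N}$ of the form
$\{k \mid \forall k_1 \exists k_2 \cdots \forall k_n\ f_{i_0}(\langle k_1, k_2, \ldots, k_n, k \rangle) = 1\}$
for any fixed $i_0$. $\bigcup_n \Sigma_n^{arith}$ is called the arithmetical
hierarchy.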

Let us consider the jump operation which associates to any subset
$A \subseteq \mathbb{N}$ the set $A' = \{x \in \mathbb{N} \mid x \in W_x^A\}$.
For all $n \ge 1$, let $A^{n} = (A^{n-1})'$ where $A^{0} = A$. We then consider
the sequence $(\emptyset^{n})_{n \ge 0}$, which is called the sequence of jumps.

Let $A$ and $B$ be two subsets of $\mathbb{N}$; let us recall that $B$ is said
to be recursive in $A$ if and only if its characteristic function is equal to
$f_k^A$ for some $k$. $B$ is said to be recursively enumerable in $A$ if and
only if there exists $k \in \mathbb{N}$ such that $B = W_k^A$.

The proofs of the two following results can be found in [20] . 

Lemma 7. For all $n \ge 1$, $B \in \Sigma_{n+1}^{arith}$ if and only if $B$ is
recursively enumerable in $\emptyset^{n}$.

The next result implies that the arithmetical hierarchy is infinite, i.e. for
each $n$, $\Sigma_n^{arith} \subsetneq \Sigma_{n+1}^{arith}$.

Lemma 8. For all $n \ge 0$, $\emptyset^{n+1}$ is not recursive in $\emptyset^{n}$.

3.2 Second Proof of Theorem 1



The first part of this second proof consists in constructing a sequence
$(t_n)_{n \ge 1}$ of recursive trees such that $\emptyset^{n}$ is reducible to
the problem of being isomorphic to $t_n$. The second argument is that the
problem of determining whether a recursive tree satisfies a $\Pi_n$-formula
according to weak monadic second-order logic is recursive in $\emptyset^{n}$
(see Lemma 10 below). Since $t_n$ is constructed in order to be equational and
WMS-definable up to isomorphism, Theorem 1 follows.

Let us make precise what we shall call a recursive tree. Let $T^{\infty}$ be the
set of trees of the form $T = (V_T, E_T, Edg_T, \{0, 1\}, lab_T)$ where

- $V_T \subseteq (\mathbb{N})^*$ is prefix closed, i.e. $x \in V_T$ and
  $y <_{pref} x$ implies $y \in V_T$;

- $E_T$ is a copy of $V_T \setminus \{()\}$;

- For $x = (x_1, \ldots, x_n) \in E_T$:
  $Edg_T(x) = ((x_1, \ldots, x_{n-1}), (x_1, \ldots, x_n))$;

- $lab_T : V_T \cup E_T \to \{0, 1\}$ and $lab_T|_{E_T} = 0$.

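For $T \in T^{\infty}$ and $x_0 \in V_T$, let $T(x_0)$ be the subtree of $T$ of
root $x_0$: $T(x_0) \in T^{\infty}$,
$V_{T(x_0)} = \{x \in (\mathbb{N})^* \mid x_0.x \in V_T\}$ and, for all
$x \in V_{T(x_0)}$, $lab_{T(x_0)}(x) = lab_T(x_0.x)$.
Let us consider for any partial function
$f : Dom(f) \subseteq \mathbb{N} \to \{0, 1\}$ the element $Tr(f)$ of
$T^{\infty}$ defined as follows: $V_{Tr(f)}$ is the greatest subset of
$Dom(f \circ \tau^*) = {\tau^*}^{-1}(Dom(f))$ which is prefix-closed; and for
all $x \in V_{Tr(f)}$, $lab_{Tr(f)}(x) = f \circ \tau^*(x)$.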
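The construction of $Tr(f)$ can be illustrated by a minimal sketch. It assumes a particular sequence numbering `tau` built from Cantor pairing (the paper only requires some bi-recursive one-one mapping), and it models a partial function as a Python callable returning `None` where it is undefined; all names below are hypothetical.

```python
from typing import Callable, Optional, Tuple

Seq = Tuple[int, ...]
PartialFn = Callable[[int], Optional[int]]  # None models "undefined"


def pair(a: int, b: int) -> int:
    """Cantor pairing, a bijection from N x N onto N."""
    return (a + b) * (a + b + 1) // 2 + b


def tau(seq: Seq) -> int:
    """Assumed numbering of sequences: () -> 0, (h, *t) -> 1 + pair(h, tau(t))."""
    code = 0
    for head in reversed(seq):
        code = 1 + pair(head, code)
    return code


def in_tree(f: PartialFn, x: Seq) -> bool:
    """x is a vertex of Tr(f) iff f is defined on the code of every prefix of x."""
    return all(f(tau(x[:i])) is not None for i in range(len(x) + 1))


def label(f: PartialFn, x: Seq) -> Optional[int]:
    """Label of x in Tr(f), or None when x is not a vertex of Tr(f)."""
    return f(tau(x)) if in_tree(f, x) else None
```

The test on every prefix is what makes the vertex set the greatest prefix-closed subset of the domain: a sequence whose code lies in the domain is still excluded if one of its prefixes does not.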

Definition 1 (Recursive Trees). A recursive tree is an element of $T^{\infty}$
of the form $Tr(f_i)$ for some $i$, where $f_i$ denotes the $i$-th recursive
partial function. Let $T_i = Tr(f_i)$ denote the $i$-th recursive tree.

For each tree $t \in T^{\infty}$ let us consider the set of integers
$RecIsom(t) = \{i \in \mathbb{N} \mid T_i$ is isomorphic to $t\}$.

Let us turn to the construction of $(t_n)_{n \ge 1}$. We need for that an
auxiliary sequence $(t'_n)_{n \ge 1}$ of trees. First, for all
$n \in \mathbb{N}$, $V_{t_n} = V_{t'_n} = (\mathbb{N})^*$; then

- $lab_{t_1}(x) = 1$ if $x = (i)$ for any even integer $i$, and
  $lab_{t_1}(x) = 0$ otherwise;

- $lab_{t'_1} = 0$;

- For $n \ge 1$, $t_{n+1}$ is defined by: $lab_{t_{n+1}}(()) = 0$ and, for all
  $x \in V_{t_{n+1}}$ such that $|x| = 1$: $t_{n+1}(x) = t'_n$ if $x = (i)$ with
  $i$ even and $t_{n+1}(x) = t_n$ otherwise.

- For $n \ge 1$, $t'_{n+1}$ is defined by: $lab_{t'_{n+1}}(()) = 0$ and, for all
  $x \in V_{t'_{n+1}}$ such that $|x| = 1$: $t'_{n+1}(x) = t_n$.



Lemma 9. For all $n \ge 1$, $\emptyset^{n}$ is recursive in $RecIsom(t_n)$.

Proof. Let $B(n, i) \subseteq \mathbb{N}$ be the $i$-th set of the $n$-th level
of the arithmetical hierarchy, i.e.
$\{k \mid \exists k_1 \forall k_2 \ldots \exists k_n : f_i(\langle k_1, k_2, \ldots, k_n, k \rangle) = 1\}$.
Let us consider the following induction hypothesis:

HR$_n$: there exists a recursive function $\rho_n$ such that for all
$i, k \in \mathbb{N}$:
$\rho_n(\langle i, k \rangle) \in RecIsom(t_n) \cup RecIsom(t'_n)$ and
$\rho_n(\langle i, k \rangle) \in RecIsom(t_n)$ iff $k \in B(n, i)$. In other
words, for all $n$ there is an algorithm which associates to any pair
$i, k \in \mathbb{N}$ an algorithm which computes a tree which is isomorphic to
$t_n$ or $t'_n$, and is isomorphic to $t_n$ if and only if $k \in B(n, i)$.

It follows from Lemma 7 that there exists an index $i_n$ such that
$\emptyset^{n} = B(n, i_n)$. Therefore, HR$_n$ implies that $\emptyset^{n}$ is
Turing reducible to $RecIsom(t_n)$ by the function
$k \mapsto \rho_n(\langle i_n, k \rangle)$. This implies the lemma.

• Proof of HR$_1$: Here we describe the algorithm which computes
$T_{\rho_1(\langle i, k \rangle)}$:

Instance: $x \in (\mathbb{N})^*$
if $|x| \ne 1$ then $lab_T(x) = 0$
else let $x_1, x_2 \in \mathbb{N}$ be such that $x = (\langle x_1, x_2 \rangle)$
    if $x_1 = 0$ then $lab_T(x) = 0$
    else $lab_T(x) = 1$ if $M_i$ stops and does not give 0 before the
         $x_2$-th calculus step on $\langle x_1 - 1, k \rangle$,
         and $lab_T(x) = 0$ otherwise

$\rho_1(\langle i, k \rangle)$ is then defined to be the index of the tree which
is described by this algorithm. Because of the lack of space, we shall omit the
proof that $\rho_1$ indeed satisfies HR$_1$.
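The labelling algorithm for HR$_1$ can be sketched as follows. This is only an illustration under stated assumptions: a machine is modelled as a Python generator function that yields once per computation step and returns its output, the pairing $\langle \cdot, \cdot \rangle$ is taken to be Cantor pairing, and "before the $x_2$-th step" is read as "within $x_2$ resumptions of the generator". None of these choices comes from the paper.

```python
from math import isqrt
from typing import Callable, Generator, Optional, Tuple

Machine = Callable[[int], Generator[None, None, int]]


def pair(a: int, b: int) -> int:
    """Cantor pairing, a bijection from N x N onto N."""
    return (a + b) * (a + b + 1) // 2 + b


def unpair(z: int) -> Tuple[int, int]:
    """Inverse of pair."""
    w = (isqrt(8 * z + 1) - 1) // 2
    b = z - w * (w + 1) // 2
    return w - b, b


def run(machine: Machine, arg: int, steps: int) -> Optional[int]:
    """Output of machine on arg if it halts within `steps` steps, else None."""
    gen = machine(arg)
    for _ in range(steps):
        try:
            next(gen)
        except StopIteration as stop:
            return stop.value
    return None


def label_hr1(machine: Machine, k: int, x: Tuple[int, ...]) -> int:
    """Label of vertex x in the tree computed for the pair (machine, k)."""
    if len(x) != 1:
        return 0
    x1, x2 = unpair(x[0])
    if x1 == 0:
        return 0  # these vertices supply infinitely many 0-labelled children
    result = run(machine, pair(x1 - 1, k), x2)
    return 1 if result is not None and result != 0 else 0
```

Because the step budget $x_2$ ranges over all integers, a halting computation with non-zero output is eventually witnessed by some depth-one vertex, while every individual label remains computable.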

• Let us suppose that HR$_n$ is true.

We begin the proof of the induction step with some preliminaries. Let
$i, k \in \mathbb{N}$; we have
$B(n+1, i) = \{k \mid \exists k_1 \forall k_2 \cdots \forall k_{n+1} : f_i(\langle k_1, k_2, \ldots, k_{n+1}, k \rangle) = 1\}$.
Let us consider the integer set
$\widetilde{B}(n+1, i) = \{\langle k, k_1 \rangle \mid \forall k_2 \cdots \forall k_{n+1} : f_i(\langle k_1, k_2, \ldots, k_{n+1}, k \rangle) = 1\}$.
Note that
$B(n+1, i) = \{k \mid \exists k_1 : \langle k, k_1 \rangle \in \widetilde{B}(n+1, i)\}$.
The set $\widetilde{B}(n+1, i)$ is the complement of an integer set
$B(n, \delta(i, n))$ which belongs to the $n$-th level of the hierarchy; note
that $\delta(i, n)$ is computable. Therefore, it follows from HR$_n$ that
$\langle k, k_1 \rangle \in \widetilde{B}(n+1, i)$ if and only if
$\rho_n(\langle \delta(i, n), \langle k, k_1 \rangle \rangle) \in RecIsom(t'_n)$.

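Let us now describe the algorithm computing $T_{\rho_{n+1}(\langle i, k \rangle)}$.

Instance: $x = (x_1, \ldots, x_l) \in (\mathbb{N})^*$
if $x = ()$ then $lab_T(x) = 0$
Let $x_{11}, x_{12} \in \mathbb{N}$ be such that $x_1 = \langle x_{11}, x_{12} \rangle$
if $x_{11} = 0$ then $lab_T(x) = lab_{t_n}((x_2, \ldots, x_l))$    (i)
else $lab_T(x) = lab_{T'}((x_2, \ldots, x_l))$ where
     $T' = T_{\rho_n(\langle \delta(i, n), \langle k, x_{11} - 1 \rangle \rangle)}$    (ii)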



First, (i) guarantees that there are infinitely many sub-trees of level 1
which are isomorphic to $t_n$.

Second, (ii) guarantees by induction that each sub-tree of level 1 is isomorphic
to $t_n$ or to $t'_n$. One verifies that if there is at least one sub-tree of
level 1 which is isomorphic to $t'_n$ then there are infinitely many such
sub-trees.

Now $T_{\rho_{n+1}(\langle i, k \rangle)}$ is isomorphic to $t_{n+1}$ if and only
if at least one of its sub-trees of level 1 is isomorphic to $t'_n$. This is
true if and only if there exists $x_{11} \ne 0$ such that
$\rho_n(\langle \delta(i, n), \langle k, x_{11} - 1 \rangle \rangle) \in RecIsom(t'_n)$,
which is equivalent, by using the preliminaries, to $k \in B(n+1, i)$.



Because of the lack of space, we state the next lemma without proof.

Lemma 10. The problem of determining whether a recursive tree satisfies a
$\Sigma_n$-formula according to weak monadic second-order logic is recursive in
$\emptyset^{n}$.

It follows from the previous lemma that, for every $n \ge 1$, there is no
$\Pi_n$-formula which defines $t_{n+1}$ up to isomorphism. Indeed, the existence
of such a formula would contradict Lemma 8. On the other hand, as in Section
2.2, $t_n$ is equational and WMS-definable. We omit the proof that it is
WMS-definable. The construction of systems of graph equations which define the
$t_n$'s can be performed by using the ideas of Section 2.2. First, transversal
edges are no longer considered here. Second, even if the definition schemes of
the $t_n$'s and the $t'_n$'s are slightly different from those of the $G_n$'s
and the $G'_n$'s, they mainly follow the same idea. One can verify that the idea
for the construction of the systems which define the $G_n$'s applies here as
well to construct systems defining the $t_n$'s. Theorem 1 then follows.



3.3 Thomas Theorem 

Here we deal with labelled complete binary trees, i.e. mappings
$t : \{l, r\}^* \to \{0, 1\}$; their set is denoted by $T_{inf}(\{0, 1\})$. In
the context of the weak monadic second-order logic of the binary tree (WMS2S
for short), the usual concept of bounded quantifier is different: following
[24] and [13], bounded quantifiers are those of the form
$\exists \alpha <_{pref} \beta \ldots$ or $\forall \alpha <_{pref} \beta \ldots$
where $<_{pref}$ denotes the prefix ordering of $\{l, r\}^*$. This defines
another concept of bounded hierarchy. However, one verifies that levels are the
same.

In [24], it is proved that the WMS2S bounded hierarchy is infinite (Theorem 2
below); the proof involves the infiniteness of the arithmetical hierarchy
together with Rabin's theorem (cf. [18]). By using the tools introduced in the
preceding section, we give an alternative proof of this result which does not
use Rabin's theorem.

Theorem 2 (Thomas 82). The bounded WMS2S hierarchy is infinite. 

Proof. Let $\Lambda : \{l, r\}^* \to \mathbb{N}^*$ be the mapping defined as
follows: for any $x_1, \ldots, x_{l+1}$,
$\Lambda(l^{x_1} r l^{x_2} r \ldots l^{x_l} r l^{x_{l+1}}) = (x_1, \ldots, x_l)$.
$\Lambda$ allows us to consider a mapping from $T_{inf}(\{0, 1\})$ to
$T^{\infty}$, which shall still be denoted by $\Lambda$, defined as follows:
for $t \in T_{inf}(\{0, 1\})$, $V_{\Lambda(t)} = (\mathbb{N})^*$ and for any
$x_1, \ldots, x_l$,
$lab_{\Lambda(t)}(x_1, \ldots, x_l) = t(l^{x_1} r l^{x_2} r \ldots l^{x_l} r)$.
$\Lambda$ contracts all the left edges of $t$. We also consider the partial
converse $\Lambda^{-1}$ defined on the set of trees of $T^{\infty}$ whose
domains are equal to $(\mathbb{N})^*$.
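The contraction of left edges can be made concrete with a small sketch. It assumes that words over $\{l, r\}$ are Python strings and that a labelled binary tree is a callable from such strings to labels; the function names are hypothetical.

```python
from typing import Callable, Tuple


def contract(word: str) -> Tuple[int, ...]:
    """Map l^{x1} r ... l^{xl} r l^{x(l+1)} to (x1, ..., xl)."""
    if set(word) - {"l", "r"}:
        raise ValueError("expected a word over {l, r}")
    blocks = word.split("r")
    # The last block (after the final r) is dropped, as in the definition.
    return tuple(len(block) for block in blocks[:-1])


def vertex_word(seq: Tuple[int, ...]) -> str:
    """Word l^{x1} r ... l^{xl} r at which the label of vertex seq is read."""
    return "".join("l" * x + "r" for x in seq)


def contracted_label(t: Callable[[str], int], seq: Tuple[int, ...]) -> int:
    """Label of seq in the contracted tree obtained from the binary tree t."""
    return t(vertex_word(seq))
```

For instance, `contract("llrlrl")` yields `(2, 1)`, and `vertex_word((2, 1))` gives back `"llrlr"`, the word without the trailing block of `l`s.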

Let us set $\mathcal{T}_n = \{t \in T_{inf}(\{0, 1\}) \mid \Lambda(t)$ is
isomorphic to $t_n\}$. We will see that the tree family $\mathcal{T}_n$ is
definable, but not at a lower level than the $n$-th of the hierarchy. Let
$\rho_n$ denote the Turing reduction of $\emptyset^{n}$ to $RecIsom(t_n)$ which
has been constructed in the proof of Lemma 9; let us note that for each integer
$k$, $V_{T_{\rho_n(k)}} = (\mathbb{N})^*$ and thus
$\Lambda^{-1}(T_{\rho_n(k)})$ is defined. Now, for each integer $k$, we have:

$k \in \emptyset^{n}$ if and only if $\Lambda^{-1}(T_{\rho_n(k)}) \in \mathcal{T}_n$.

On the other hand, one can verify that the family $\mathcal{T}_n$ can be
WMS2S-defined by a $\Pi_\ell$-formula $\varphi_n$ for a suitable $\ell$. Then
$k \in \emptyset^{n}$ if and only if
$\Lambda^{-1}(T_{\rho_n(k)}) \models \varphi_n$.
By a result similar to Lemma 10, one verifies that this last predicate is
recursive in $\emptyset^{\ell}$, which by Lemma 8 implies that $\ell \ge n$.
Theorem 2 is proved.

Abstract. This paper reports on results concerning the combination
of non-standard semantics via interpreters. We define what a semantics 
combination means and identify under which conditions a combination 
can be realized by computer programs (robustness, safely combinable). 
We develop the underlying mathematical theory and examine the mean- 
ing of several non-standard interpreter towers. Our results suggest a tech- 
nique for the implementation of a certain class of programming language 
dialects by composing a hierarchy of non-standard interpreters. 



1 Introduction 



The definition of programming language semantics from simpler, more elemen- 
tary parts is an intriguing question [6,11,17,18]. This paper reports on new results 
concerning the combination of semantics via non-standard interpreters. Instead 
of using the familiar tower of interpreters [13] for implementing the standard 
semantics of a programming language, we generalize this idea to implement the 
non-standard semantics of a programming language by combining one or more 
non-standard interpreters. 



[Figure: Standard Hierarchy (p, intN, intL) and Non-Standard Hierarchy
(p, nintN, nintL)]



The essence of the interpreter tower is to evaluate an N-interpreter intN
written in L by an L-interpreter intL written in some ground language. This
means we give standard semantics to N-programs via L's standard semantics.
But what does it mean to build a tower involving one or more non-standard
interpreters? For example, what does it mean for the semantics of an N-program p
if we replace interpreter intL by an inverse-interpreter nintL?

A formal answer to this question and related non-standard towers will be
given in this paper. Using the mathematical foundations developed here some 
well-known results about the combination of standard interpreters are shown, as 
well as new results about the combination of non-standard semantics. This ex- 
tends our previous work on semantics modifiers [1] and inverse computation [2] 
where we have observed that some non-standard semantics can be ported via 
standard interpreters. We can now formalize a class of non-standard seman- 
tics that can serve as semantics modifiers, and reason about new non-standard 
combinations, some of which look potentially useful. We focus on deterministic 
programming languages as an important case for practice. Since this includes 
universal programming languages, there is no loss of generality. Extending our 
results to other computation models can be considered for future work. 

In practice, the implementation of a non-standard tower will be inefficient be- 
cause each level of interpretation adds extra computational overhead. To improve 
efficiency we assume program specialization techniques. Program specialization, 
or partial evaluation [9,5], was shown to be powerful enough to collapse towers 
of standard interpreters and to drastically reduce their interpretive overhead. 
We believe powerful program transformation tools will enable us in the future 
to combine non-standard interpreters with less concern about efficiency, which 
may make this approach more practical for the construction of software. 

Finally, note that we use the term 'programming language' in a broad sense,
that is, not only for universal programming languages, such as Fortran, C or
ML, but also for domain-specific languages (e.g., networks, graphics), and for
languages which are computationally incomplete (e.g., regular grammars). This
means that our results potentially apply to a broad spectrum of application
areas.

The main contributions of this paper are: (i) a mathematical foundation for 
a theory about semantics combination: we define what semantics combinations 
mean and we identify several theoretical combinations; (ii) an approach to im- 
plementing programming language dialects by building towers of non-standard 
interpreters and their correctness; (iii) explaining the essence of known results 
such as interpreter towers and the Futamura projections [7], and giving novel 
insights regarding the semantics modification of programming languages. Proofs 
are omitted due to space limitations. 

2 Foundations for Languages and Semantics

Before introducing languages and semantics we give two preliminary definitions.
We define a projection ($A.b$) to form new sets given a set of tuples $A$ and an
element $b$, and a preserving set definedness relation ($A \dot{\subseteq} B$)
which requires $A \subseteq B$ and $A$ to be non-empty when $B$ is non-empty.

Definition 1 (projection). Let $A, B, C$ be sets, let $A \subseteq B \times C$,
and let $b \in B$, then we define projection
$A.b \stackrel{def}{=} \{ c' \mid (b', c') \in A, b' = b \}$.



Example 1. Let A = {(2, 3, 1), (5, 6, 7), (2, 4, 1)} then A.2 = {(3, 1), (4, 1)} . 




Combining Semantics with Non-standard Interpreter Hierarchies 203 



Definition 2 (preserving set definedness). Let $A, B$ be sets, then we define
relation preserving set definedness
$(A \dot{\subseteq} B) \stackrel{def}{\iff} ((A \subseteq B) \wedge (B \ne \emptyset \Rightarrow A \ne \emptyset))$.
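Both preliminary definitions translate directly into set operations. The sketch below assumes finite sets of Python tuples; the function names are illustrative and not taken from the paper.

```python
def project(A, b):
    """A.b: keep the tuples of A whose first component is b, dropping that component."""
    return {t[1:] for t in A if t[0] == b}


def preserves_definedness(A, B):
    """A is a subset of B, and A is non-empty whenever B is non-empty."""
    return A <= B and (not B or bool(A))
```

With the set of Example 1, `project({(2, 3, 1), (5, 6, 7), (2, 4, 1)}, 2)` evaluates to `{(3, 1), (4, 1)}`.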

We define languages, semantics and functional equivalence using a relational ap- 
proach. When we speak of languages we mean formal languages. As is customary, 
we use the same universal data domain {D) for all languages and for representing 
all programs. Mappings between different data domains and different program 
representations are straightforward to define and not essential for our discussion. 

The reader should be aware of the difference between the abstract language 
definitions given in this section, which may be non-constructive, and the defini- 
tions for programming languages in Sect. 3 which are constructive. The formal- 
ization is geared towards the definition of deterministic programming languages. 



Definition 3 (language). A language $L$ is a triple
$L = (P_L, D_L, [\![\ ]\!]_L)$, where $P_L \subseteq D$ is the set of
L-programs, $D_L = D$ is the data domain for $L$, and $[\![\ ]\!]_L$ is the
semantics of $L$: $[\![\ ]\!]_L \subseteq P_L \times D \times D$. We denote by
$\mathcal{L}$ the set of all languages.

Definition 4 (program semantics, application). Let
$L = (P_L, D, [\![\ ]\!]_L)$ be a language, let $p \in P_L$ be an L-program,
let $d \in D$ be data, then the semantics of $p$ is defined by
$[\![\ ]\!]_L.p \subseteq D \times D$, and the application of $p$ to $d$ by
$[\![\ ]\!]_L.p.d \subseteq D$.

Definition 5 (functional equivalence). Let
$L_1 = (P_{L_1}, D, [\![\ ]\!]_{L_1})$ and
$L_2 = (P_{L_2}, D, [\![\ ]\!]_{L_2})$ be languages, let $p_1 \in P_{L_1}$ and
$p_2 \in P_{L_2}$ be programs, then $p_1$ and $p_2$ are functionally equivalent
iff $[\![\ ]\!]_{L_1}.p_1 = [\![\ ]\!]_{L_2}.p_2$.

Note that we defined the semantics $[\![\ ]\!]_L$ of a language as a relation
($P_L \times D \times D$). For convenience, we will sometimes use notation
$[\![p]\!]_L\, d$ for application $[\![\ ]\!]_L.p.d$, and notation
$[\![p]\!]_L$ for program semantics $[\![\ ]\!]_L.p$. As Def. 4 shows, the
result of an application is always a set of data, and we can distinguish three
cases:

$[\![p]\!]_L\, d = \emptyset$ : application undefined,

$[\![p]\!]_L\, d = \{a\}$ : application defined (deterministic case),

$[\![p]\!]_L\, d = \{a_1, a_2, \ldots\}$ : application defined (non-deterministic case).

Definition 6 (deterministic language). A language $L = (P_L, D, [\![\ ]\!]_L)$
is deterministic iff
$\forall (p_1, d_1, a_1), (p_2, d_2, a_2) \in [\![\ ]\!]_L : (p_1 = p_2 \wedge d_1 = d_2) \Rightarrow (a_1 = a_2)$.
We denote by $\mathcal{D}$ the set of all deterministic languages
($\mathcal{D} \subseteq \mathcal{L}$).
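The relational view of Definitions 3 to 6 can be modelled on finite examples. The following is a minimal sketch, assuming a semantics is a finite Python set of `(program, input, output)` triples; the function names and the sample programs are hypothetical.

```python
def program_semantics(sem, p):
    """Semantics of p: the (input, output) pairs of the relation for program p."""
    return {(d, a) for (q, d, a) in sem if q == p}


def apply(sem, p, d):
    """Application of p to d: the (possibly empty) set of results."""
    return {a for (q, e, a) in sem if q == p and e == d}


def is_deterministic(sem):
    """True iff every (program, input) pair has at most one output."""
    seen = {}
    for (p, d, a) in sem:
        if seen.setdefault((p, d), a) != a:
            return False
    return True
```

An empty result set models an undefined application, a singleton the deterministic case, and a larger set the non-deterministic case.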

Relations $\subseteq$ and $\dot{\subseteq}$ have a clear meaning for
application: $[\![p]\!]_{L_1}\, d \subseteq [\![p]\!]_{L_2}\, d$ tells us that
the left application may be undefined even when the right application is
defined (definedness is not preserved);
$[\![p]\!]_{L_1}\, d \mathrel{\dot{\subseteq}} [\![p]\!]_{L_2}\, d$ tells us
that both applications are either defined or undefined (definedness is
preserved). In Def. 7 we use $\dot{\subseteq}$ to define a definedness
preserving relation between semantics ($\dot{\subseteq}$).

Definition 7 (preserving semantics definedness). Let
$L_1 = (P_{L_1}, D, [\![\ ]\!]_{L_1})$ and
$L_2 = (P_{L_2}, D, [\![\ ]\!]_{L_2})$ be languages such that
$P_{L_1} = P_{L_2}$, then we define relation preserving semantics definedness
($\dot{\subseteq}$) as follows:

$([\![\ ]\!]_{L_1} \mathrel{\dot{\subseteq}} [\![\ ]\!]_{L_2}) \stackrel{def}{\iff} (\forall p \in P_{L_1}\ \forall d \in D : [\![\ ]\!]_{L_1}.p.d \mathrel{\dot{\subseteq}} [\![\ ]\!]_{L_2}.p.d)$.

2.1 Semantics Properties and Language Dialects

A property $S$ is a central concept for the foundations of non-standard
semantics. It specifies a modification of the standard semantics of a language.
When we speak of an S-dialect $L'$ of a language $L$, then the relation of
input/output of all L-programs applied under $L'$ must satisfy property $S$.
For example, we require that the output of applying an L-program under an
inverse-dialect $L'$ of $L$ [2] is a possible input of that program applied
under $L$'s standard semantics. Given a request $r$ for S-computation, there may
be infinitely many answers $a$ that satisfy property $S$.^1 We consider each of
them as correct wrt $S$.

A property describes a semantics modification for a set of languages. The
specification can be non-constructive and non-deterministic. We specify a
property $S$ for a set of languages as a set of tuples $(L, p, r, a)$. We say a
language $L'$ is an S-dialect of $L$ if both languages have the same syntax, and
the semantics of $L'$ is a subset of $S.L$. We define three types of dialects
that can be derived from $S$. Later in Sect. 3 we consider only those dialects
that are constructive.

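Definition 8 (property). Let $\mathcal{M} \subseteq \mathcal{L}$, then set $S$
is a property for $\mathcal{M}$ iff

$S \subseteq \bigcup_{L \in \mathcal{M}} \{L\} \times P_L \times D \times D$.

Example 2 (properties). Let $\mathcal{M} = \mathcal{L}$ and
$R \in \mathcal{L}$, then $Id$, $Inv$, $Trans_R$ and $Copy$ are properties for
$\mathcal{L}$, namely identity, inversion, translation, and copying of programs.
Other, more sophisticated properties may be defined that way.

$Id \stackrel{def}{=} \{ (L, p, r, a) \mid L \in \mathcal{L}, p \in P_L, r \in D, a \in [\![p]\!]_L\, r \}$

$Inv \stackrel{def}{=} \{ (L, p, r, a) \mid L \in \mathcal{L}, p \in P_L, a \in D, r \in [\![p]\!]_L\, a \}$

$Trans_R \stackrel{def}{=} \{ (L, p, r, p') \mid L \in \mathcal{L}, p \in P_L, r \in D, p' \in P_R : [\![p]\!]_L = [\![p']\!]_R \}$

$Copy \stackrel{def}{=} \{ (L, p, r, p) \mid L \in \mathcal{L}, p \in P_L, r \in D \}$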
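For a single finite language, the most general semantics induced by $Id$, $Inv$ and $Copy$ can be computed explicitly. The sketch below assumes the same finite-triple model of a semantics as before (`(program, input, output)`), with hypothetical function names; $Trans_R$ is omitted because it requires deciding functional equivalence.

```python
def id_semantics(sem):
    """Id.L: requests and answers are exactly the standard inputs and outputs."""
    return set(sem)


def inv_semantics(sem):
    """Inv.L: the request is a standard output, each answer a matching input."""
    return {(p, out, inp) for (p, inp, out) in sem}


def copy_semantics(programs, data):
    """Copy.L: every request is answered with the program text itself."""
    return {(p, r, p) for p in programs for r in data}
```

For a squaring program with triples `("sq", -2, 4)` and `("sq", 2, 4)`, `inv_semantics` returns both `("sq", 4, -2)` and `("sq", 4, 2)`, which shows why the most general $Inv$-dialect is usually non-deterministic.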



Definition 9 (dialects). Let $S$ be a property for $\mathcal{M}$, let
$L \in \mathcal{M}$, then $S.L \subseteq P_L \times D \times D$ is the most
general S-semantics for $L$. Let $L = (P_L, D, [\![\ ]\!]_L)$, then a language
$L' = (P_L, D, [\![\ ]\!]_{L'}) \in \mathcal{L}$ is (i) the most general
S-dialect of $L$ iff $[\![\ ]\!]_{L'} = S.L$, (ii) an S-dialect of $L$ iff
$[\![\ ]\!]_{L'} \mathrel{\dot{\subseteq}} S.L$, and (iii) an S-semi-dialect of
$L$ iff $[\![\ ]\!]_{L'} \subseteq S.L$. We denote by $S|L$ the most general
S-dialect of $L$ and by $\mathcal{D}_{S|L}$ the set of all deterministic
S-dialects of $L$.

The most general S-semantics $S.L$ specifies all correct answers for an
application $S.L.p.r$ given $S$, $L$, $p$, $r$. In general, the most general
dialect $S|L$ of a language $L$ will be non-deterministic. This allows the
definition of different S-dialects for $L$.
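The difference between a dialect and a semi-dialect can be checked mechanically on finite relations. This sketch assumes semantics given as finite sets of `(program, request, answer)` triples; the helper names are hypothetical.

```python
def apply(sem, p, d):
    """Set of answers of program p on request d."""
    return {a for (q, e, a) in sem if q == p and e == d}


def preserves_definedness(A, B):
    """A is a subset of B, and A is non-empty whenever B is non-empty."""
    return A <= B and (not B or bool(A))


def is_semi_dialect(sem, general):
    """Every answer is correct, but some defined requests may go unanswered."""
    return sem <= general


def is_dialect(sem, general):
    """Every answer is correct and every answerable request is answered."""
    pairs = {(p, d) for (p, d, _) in sem | general}
    return all(
        preserves_definedness(apply(sem, p, d), apply(general, p, d))
        for (p, d) in pairs
    )
```

A deterministic inverse of squaring that always returns the positive root is a dialect of the most general inverse semantics; one that answers only some requests is merely a semi-dialect.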



^1 When we talk about non-standard semantics, we use the terms request and
answer to distinguish them from input and output of standard computation.

Example 3 (dialects). There are usually infinitely many deterministic and
non-deterministic Inv-dialects of $L$ (they differ in which and how many inverse
answers they return). For property $Copy$, the most general dialect $Copy|L$ is
always deterministic and there exists only one Copy-dialect for each $L$.
Another example is property $Id$. If $L$ is non-deterministic, then there are
usually infinitely many deterministic and non-deterministic Id-dialects. But if
$L$ is deterministic, then there exists only one deterministic Id-dialect $L'$
and $L' = Id|L = L$.

Definition 10 (robust property). Let $\mathcal{M}$ be a set of languages, let
$S$ be a property for $\mathcal{M}$, then $S$ is robust iff all functionally
equivalent programs are also functionally equivalent under the most general
S-dialect:

$\forall L_1, L_2 \in \mathcal{M}\ \forall p_1 \in P_{L_1}\ \forall p_2 \in P_{L_2} : ([\![p_1]\!]_{L_1} = [\![p_2]\!]_{L_2}) \Rightarrow ([\![p_1]\!]_{S|L_1} = [\![p_2]\!]_{S|L_2})$.

Example 4 (robustness). All properties in Ex. 2 are robust ($Id$, $Inv$,
$Trans_R$), except $Copy$, which returns different results for functionally
equivalent programs $p \ne p'$.

The motivation for defining robustness is that we are mainly interested in a 
class of properties that can be combined by interpreters. When we use a robust 
property S we cannot distinguish by the semantics of the most general dialect S\L 
two programs which are functionally equivalent under L’s standard semantics. 
A robust property specifies an extensional modification of a language semantics 
which is independent of the particular operational features of a program. 

2.2 Combining Properties 

Two properties $S'$ and $S''$ can be combined into a new property
$S' \circ S''$. Intuitively speaking, one gets an $(S' \circ S'')$-dialect of a
language $L$ by taking an $S'$-dialect of an $S''$-dialect of $L$. This
combination is captured by projection $S'.L''.p.r$ in the following definition.
The reason for choosing language $L''$ from the set of deterministic
$S''$-dialects $\mathcal{D}_{S''|L}$ of $L$ is that later we will use
deterministic programming languages for implementing property combinations.

Definition 11 (combination). Let $S'$, $S''$ be properties for $\mathcal{D}$,
then we define

$S' \circ S'' \stackrel{def}{=} \{ (L, p, r, a) \mid L \in \mathcal{D}, p \in P_L, r \in D, a \in D, L'' \in \mathcal{D}_{S''|L}, a \in S'.L''.p.r \}$.



Example 5 (combination). Let $S$ be a property for $\mathcal{D}$, then some of
the combinations of the properties in Example 2 are as follows:

$S \circ Id = S$ : Right combination with identity does not change property $S$.
$Id \circ S = S$ : Left combination with identity does not change property $S$.
$Trans_R \circ S = S\text{-}Trans_R$ : S-translation to $R$ (will be explained in Sec. 4.3).
$Inv \circ S = S^{-1}$ : Inversion of property $S$ (will be explained in Sec. 4.4).
$Copy \circ S = Copy$ : "Left zero" for property $S$.

In addition, we are interested in combinations $(S' \circ S'')$ that guarantee
that all applications $S'.L''.p.r$ are defined for the same set of
program-request pairs $(p, r)$ regardless of which deterministic $S''$-dialect
$L''$ we select for $L$. This requires that $S'$ and $S''$ satisfy the condition
given in the following definition. In this case we say that $S'$ and $S''$ are
safely combinable ($S' \bowtie S''$).
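Definition 12 (safely combinable). Let $S'$, $S''$ be properties for
$\mathcal{D}$, then $S'$ is safely combinable with $S''$ ($S' \bowtie S''$) iff

$\forall L \in \mathcal{D},\ \forall L''_1, L''_2 \in \mathcal{D}_{S''|L},\ \forall p \in P_L,\ \forall d \in D :$
$(S'.L''_1.p.d \ne \emptyset) \Leftrightarrow (S'.L''_2.p.d \ne \emptyset)$.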



Example 6 (safely combinable). Let $S'$, $S''$ be properties, and let $S'$ be
robust, then the following combinations are always safely combinable:

$Id \bowtie S''$, $S' \bowtie Id$, $S' \bowtie Copy$.



3 Programming Languages 

We now turn to programming languages, and focus on deterministic program- 
ming languages as an important case for practice. Since this includes universal 
programming languages, there is no loss of generality. All computable functions 
can be expressed. First, we give definitions for programming languages and inter- 
preters, then we introduce non-standard interpreters which we define as programs 
that implement non-standard dialects. 

As before we assume a universal data domain $D$ for programming languages,
but require $D$ to be constructive (recursively enumerable) and to be closed
under tupling: $d_1, \ldots, d_k \in D \Rightarrow [d_1, \ldots, d_k] \in D$.
For instance, a suitable choice for $D$ is the set of S-expressions familiar
from Lisp [13]. Since we consider only deterministic programming languages, the
result of an application is either a singleton set or the empty set.

Definition 13 (programming language). A programming language $L$ is a
deterministic language $L = (P_L, D_L, [\![\ ]\!]_L)$ where $P_L \subseteq D$ is
the recursively enumerable set of L-programs, $D_L = D$ is the recursively
enumerable data domain for $L$, and $[\![\ ]\!]_L$ is the recursively enumerable
semantics of $L$: $[\![\ ]\!]_L \subseteq P_L \times D \times D$.

We denote by $\mathcal{P}$ the set of all programming languages.

Definition 14 (interpreter). Let $L = (P_L, D, [\![\ ]\!]_L)$ and
$M = (P_M, D, [\![\ ]\!]_M)$ be programming languages, then an M-program intL is
an interpreter for $L$ in $M$ iff

$\forall p \in P_L,\ \forall d \in D : [\![intL]\!]_M\, [p, d] = [\![p]\!]_L\, d$.

Definition 15 (partially fixed argument). Let $L = (P_L, D, [\![\ ]\!]_L)$ be a
programming language, let $p, p' \in P_L$, and let $d_1 \in D$ such that

$\forall d_2 \in D : [\![p']\!]_L\, d_2 = [\![p]\!]_L\, [d_1, d_2]$.

If program $p'$ exists we denote it by "$[p, [d_1, \bullet]]$", and we have

$\forall d_2 \in D : [\![\,[p, [d_1, \bullet]]\,]\!]_L\, d_2 = [\![p]\!]_L\, [d_1, d_2]$.

In a universal programming language we can always write program
$[p, [d_1, \bullet]]$ given $p \in P_L$ and $d_1 \in D$ (this is similar to
Kleene's S-m-n theorem). In a programming language that supports abstraction and
application as in the lambda-calculus we can define:
$[p, [d_1, \bullet]] = \lambda d_2 . p\, [d_1, d_2]$.
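In a language with closures, the lambda-calculus definition above is a one-liner. The sketch assumes that programs are modelled as Python callables taking a single datum (here a two-element list); `fix_first` and `power` are hypothetical names used only for illustration.

```python
def fix_first(p, d1):
    """Build [p, [d1, .]]: the program that maps d2 to p applied to [d1, d2]."""
    return lambda d2: p([d1, d2])


def power(args):
    """A two-argument sample program: args = [base, exponent]."""
    base, exponent = args
    return base ** exponent


power_of_two = fix_first(power, 2)
```

Here `power_of_two(10)` and `power([2, 10])` denote the same application, which is exactly the equation required by Definition 15.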

Definition 16 (prog. lang. dialects). Let $\mathcal{P}' \subseteq \mathcal{P}$,
let $S$ be a property for $\mathcal{P}'$, and let
$L = (P_L, D, [\![\ ]\!]_L) \in \mathcal{P}'$, then a prog. language
$L' = (P_L, D, [\![\ ]\!]_{L'}) \in \mathcal{P}$ is an S-dialect of $L$ iff
$[\![\ ]\!]_{L'} \mathrel{\dot{\subseteq}} S.L$; and an S-semi-dialect of $L$
iff $[\![\ ]\!]_{L'} \subseteq S.L$.

Definition 17 (S|L/M-interpreter). Let $L$, $M$ be programming languages, let
$\mathcal{P}' \subseteq \mathcal{P}$, let $S$ be a property for $\mathcal{P}'$,
and let $L \in \mathcal{P}'$, then an M-program nintL is an S-interpreter for
$L$ in $M$ (S|L/M-interpreter) if there exists an S-dialect $L'$ of $L$ such
that nintL is an interpreter for $L'$ in $M$.

An interpreter for a language $L$ is an implementation of the standard semantics
of $L$, while an S-interpreter is an implementation, if it exists, of an
S-dialect $L'$ of $L$. Since a property $S$ may specify infinitely many
S-dialects for $L$ (see Sect. 2.1), we say that any program that implements one
of these dialects is an S-interpreter.^2

In general, not every non-standard S-dialect is computable. Some dialects may be
undecidable, others (semi-)decidable. A non-standard interpreter nintL realizes
an S-dialect for a given language $L$, and having nintL, we can say that $S$ can
be realized constructively for $L$. If this is the case for two properties $S'$
and $S''$, then $(S' \circ S'')$ can be implemented by a tower of non-standard
interpreters.

4 Towers of Non-standard Interpreters

Definition 18 (non-standard tower). Let $\mathcal{P}' \subseteq \mathcal{P}$,
let $M \in \mathcal{P}$, let $N, L \in \mathcal{P}'$, let $S'$, $S''$ be
properties for $\mathcal{P}'$, let $S'$ be robust, let M-program nintL' be an
$S'$-interpreter for $L$ in $M$, let L-program nintN'' be an $S''$-interpreter
for $N$ in $L$, and let $p \in P_N$, $d \in D$, then a non-standard tower is
defined by application

$[\![nintL']\!]_M\, [\,[nintN'', [p, \bullet]],\, d\,]$.



Theorem 1 (correctness of non-standard tower). Let
$\mathcal{P}' \subseteq \mathcal{P}$, let $M \in \mathcal{P}$, let
$N, L \in \mathcal{P}'$, let $S'$, $S''$ be properties for $\mathcal{P}'$, let
M-program nintL' be an $S'$-interpreter for $L$ in $M$, let L-program nintN'' be
an $S''$-interpreter for $N$ in $L$:

- If $S'$ is robust then the following non-standard tower implements an
  $(S' \circ S'')$-semi-dialect of $N$ in $M$ (cf. Fig. 1):

  $\forall p \in P_N,\ \forall d \in D : [\![nintL']\!]_M\, [\,[nintN'', [p, \bullet]],\, d\,] \subseteq (S' \circ S'').N.p.d$.

- If $S'$ is robust and safely combinable with $S''$ ($S' \bowtie S''$) then the
  following non-standard tower implements an $(S' \circ S'')$-dialect of $N$ in
  $M$ (cf. Fig. 1):

  $\forall p \in P_N,\ \forall d \in D : [\![nintL']\!]_M\, [\,[nintN'', [p, \bullet]],\, d\,] \mathrel{\dot{\subseteq}} (S' \circ S'').N.p.d$.

^2 In general, deterministic programs cannot implement all S-dialects since some
dialects may be non-deterministic (e.g., Inv-dialects).

[Figure: language levels N, L, M; nintN'' (any S'') is run by nintL' (robust
S'). Tower implements: a semi-dialect, or a dialect when S' and S'' are safely
combinable ($S' \bowtie S''$)]

Fig. 1. Two-level non-standard tower

The theorem guarantees that a non-standard tower consisting of an
$S'$-interpreter nintL' and an $S''$-interpreter nintN'' returns a result (if
defined) that is correct wrt $S' \circ S''$, provided property $S'$ is robust.
Regardless of how the two interpreters are implemented, we obtain an
implementation of (at least) an $(S' \circ S'')$-semi-dialect. If in addition
$S'$ and $S''$ are safely combinable ($S' \bowtie S''$), we obtain an
implementation of an $(S' \circ S'')$-dialect. In contrast to the mathematical
combination of two properties (Sect. 2.2), a combination of two non-standard
interpreters requires that the source language of nintL' and the implementation
language of nintN'' match (i.e. language $L$). This is illustrated in Fig. 1. We
showed which properties are robust (Sect. 2.1) and which are safely combinable
(Sect. 2.2).

Figure 2 summarizes relation safely combinable for combinations of properties
defined in Ex. 2. For $Trans_R$ we assume $R$ is a universal language. Property
$Inv$ is not always safely combinable. While some properties $S'$ and $S''$ are
not safely combinable for all languages, they may be safely combinable for some
languages. Two cases when properties are safely combinable for a subset of
$\mathcal{D}$:

1. Only one $S''$-dialect exists for $N$. For instance, for property $Inv$ this
   condition is satisfied for programming languages in which all programs are
   injective (this is not true for all programming languages).

2. Property $S'$ is total for $N''$. For example, if $R$ is a universal
   programming language in property $Trans_R$, then every source program can be
   translated to $R$. Thus, $Trans_R$ is totally defined. More formally, $S'$ is
   a total property for $N''$ if we have:
   $\forall p \in P_N,\ \forall d \in D : S'.N''.p.d \ne \emptyset$.

We now examine several semantics combinations and their non-standard towers. 
The results are summarized in Fig. 3. (Multi-level towers can be constructed by 
repeating the construction of a two-level tower.) 

S' ⋈ S''     Id     Inv    Trans_Q   Copy
Id           Yes    Yes    Yes       Yes
Inv          Yes    No     No        Yes
Trans_R*     Yes    Yes    Yes       Yes

* R is a universal programming language

Fig. 2. S' and S'' are safely combinable

4.1 Classical Interpreter Tower

Two classical results about standard interpreters can be obtained in our
framework using two facts: (i) property $Id$ is robust (Sect. 2.1), and (ii)
$Id$ is safely combinable with any property $S$ for $\mathcal{D}$:
$Id \bowtie S$ (Sect. 2.2). We also observe that an interpreter intL is an
Id-interpreter because $Id|L = L$ is an Id-dialect of $L$, and according to
Def. 17 intL is an interpreter for this Id-dialect. Thus we have:

Corollary 1 (Id-interpreter). Let $L, M \in \mathcal{P}$, and let M-program
intL be an interpreter for $L$ in $M$, then intL is an Id-interpreter for $L$
in $M$.

$Id \circ Id = Id$. Since we consider only deterministic programming languages,
there exists only one deterministic Id-dialect, and since $Id \bowtie Id$ is
safely combinable, we can build the following non-standard tower consisting of
an L/M-interpreter intL and an N/L-interpreter intN:

$\forall p \in P_N,\ \forall d \in D : [\![intL]\!]_M\, [\,[intN, [p, \bullet]],\, d\,] = [\![p]\!]_N\, d$.

It is easy to see that this combination is the classical interpreter tower. The
key point is that the semantics of $N$ is preserved by combination
$Id \circ Id$. Property $Id$ can be regarded as the identity operation in the
algebra of semantics combination.

$Id \circ S = S$. More generally, any S-interpreter nintN for $N$ in $L$ can be
evaluated in $M$ given an Id-interpreter intL for $L$ in $M$. The non-standard
tower is a faithful implementation of an S-dialect in $M$. Not surprisingly, an
S-interpreter can be ported from $L$ to $M$ using an Id-interpreter intL.

$\forall p \in P_N,\ \forall d \in D : [\![intL]\!]_M\, [\,[nintN, [p, \bullet]],\, d\,] \mathrel{\dot{\subseteq}} S.N.p.d = [\![p]\!]_{S|N}\, d$.



4.2 Semantics Modifiers 



A novel application of Id-interpreters can be obtained from the combination S ∘ Id.

S ∘ Id = S. If property S for 𝒟 is robust then S ⋈ Id is safely combinable, and
we can write the following non-standard tower consisting of an Id-interpreter
intN for N in L and an S-interpreter nintL for L in M:



Vp G Pn, yd G D : InintLjj^^ [ [ intN, [ p, • ] ], d ] ® S.N.p.d = Iplgjjv d . 

The equation asserts that an S-interpretation of N-programs can be performed
by combining an Id-interpreter for N in L and an S-interpreter for L. The
non-standard tower implements an S-interpreter for N. Every S-interpreter captures




















Sergei Abramov and Robert Glück



S' ∘ S''    Id                  Inv                    Trans_Q              Copy

Id          Id (int-tower)      Inv (porting)          Trans_Q (porting)    Copy (porting)
Inv         Inv (semmod)        Id (identity)          Cert_Q (certifier)   Recog (recognizer)
Trans_R     Trans_R (semmod)    InvTrans_R (inverter)  G_Q/R                Arch_R (archiver)

Fig. 3. Examples of property combinations



the essence of S-computation regardless of its source language. This is radically
different from other forms of program reuse because all interpreters implementing
robust properties can be ported to new programming languages by means of
Id-interpreters. In other words, the entire class of robust properties is suited as
semantics modifiers [1]. This idea was demonstrated for the following examples.

Inv ∘ Id = Inv. Since Inv is a robust property (Sect. 2.1), we can reduce the
problem of writing an Inv-interpreter for N to the simpler problem of writing
an Id-interpreter for N in L, provided an inverse interpreter for L exists. For
experimental results see [16,1,2].
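The reduction above can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: "L-programs" are modelled as Python callables on integers, the toy language N (lists of `add`/`mul` instructions) and the names `inv_interpreter_L`, `int_N` and `inv_interpreter_N` are hypothetical, and inversion is done by brute-force search over a finite domain rather than by any technique from the cited experiments.

```python
from typing import Callable, Iterable, List, Tuple

# Assumption: an "L-program" is any Python callable from int to int.
def inv_interpreter_L(prog: Callable[[int], int], answer: int,
                      domain: Iterable[int]) -> List[int]:
    """Toy Inv-interpreter for L: all inputs in `domain` that `prog` maps to `answer`."""
    return [d for d in domain if prog(d) == answer]

# Assumption: an N-program is a list of (instruction, constant) pairs.
NProgram = List[Tuple[str, int]]

def int_N(p: NProgram, d: int) -> int:
    """Standard (Id) interpreter for N, itself written as an L-program."""
    acc = d
    for op, k in p:
        if op == "add":
            acc += k
        elif op == "mul":
            acc *= k
        else:
            raise ValueError(f"unknown N instruction: {op}")
    return acc

def inv_interpreter_N(p: NProgram, answer: int, domain: Iterable[int]) -> List[int]:
    """Inverse interpretation of N via the tower: run int_N on p under the Inv-interpreter for L."""
    return inv_interpreter_L(lambda d: int_N(p, d), answer, domain)

affine = [("mul", 2), ("add", 3)]          # the N-program computing 2*d + 3
print(inv_interpreter_N(affine, 11, range(-10, 11)))
```

No inverse interpreter for N is written here; only the standard interpreter `int_N` and the inverse interpreter for L are combined.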

Trans_R ∘ Id = Trans_R. A translator is a classical example of an equivalence
transformer. Since property Trans_R is robust for all universal programming
languages R, this equation asserts that translation from N to R can be performed
by combining a standard interpreter for N in L and a translator from L to R.
One realization of this idea is the Futamura projections [7]: it was shown [9] that
partial evaluation can implement this equation efficiently (for details see also [1]).



4.3 Non-standard Translation 



Trans_R ∘ S = S-Trans_R, where S is a property for 𝒟 and

$$S\text{-}Trans_R \stackrel{\mathrm{def}}{=} \{\, (L, p, r, p') \mid L \in \mathcal{D},\ p \in P_L,\ r \in D,\ L' \in \mathcal{D}_{S|L},\ p' \in P_R,\ \llbracket p' \rrbracket_R = \llbracket p \rrbracket_{L'} \,\}$$

This combination describes the semantics of translating an N-program p into a
standard R-program p' which is functionally equivalent to p evaluated under
a deterministic S-dialect of N. In other words, non-standard computation of
p is performed by standard computation of p' in R. We say S-Trans_R is the
semantics of S-compilation into R. This holds regardless of S. We have already
met the case of Id-translation (Sect. 4.2). Let us examine two examples:




N-program p into a “self-extracting archive” written in R.
4.4 Semantics Inversion

Inv ∘ S = S⁻¹, where S is a property for 𝒟 and

$$S = \{\, (L, p, a, r) \mid L \in \mathcal{D},\ p \in P_L,\ a \in D,\ r \in S.L.p.a \,\}$$
$$S^{-1} = \{\, (L, p, r, a) \mid L \in \mathcal{D},\ p \in P_L,\ a \in D,\ r \in S.L.p.a \,\}$$

The combination describes the inversion of a property S. Three examples:

Inv ∘ Trans_Q = Cert_Q: semantics of a program certifier which, given a
Q-program p' and an L-program p, verifies whether p' is a translated version of p.

Inv ∘ Copy = Recog: semantics of a recognizer, a program checking whether
two L-programs are textually identical (a rather simple-minded semantics).

Inv ∘ Inv = Id: the inverse semantics of an inverse semantics is the Id-semantics
(in general they are not safely combinable and a tower of two Inv-interpreters
ensures only a semi-dialect).



5 Related Work 



Interpreters are a convenient way for designing and implementing programming 
languages [13,6,14,19,10]. Early operational semantics [12] and definitional inter- 
preters [15] concerned the definition of one programming language using another 
which, in our terms, relies on the robustness of Id-semantics.

Monadic interpreters have been studied recently to support features of a
programming language, such as profiling, tracing, and error messages (e.g., [11,18]).
These works are mostly concerned with modifying operational aspects of a par- 
ticular language, rather than modifying extensional semantics properties of a 
class of languages. We studied language-independent conditions for analyzing 
semantics changes and provided a solid mathematical basis for their correctness. 

Meta-interpreters have been used in logic programming for instrumenting 
programs and for changing ways of formal reasoning [20,3]. These modifications 
usually change the inference rules of the underlying logic system, and in general 
do not attempt the deep semantics changes covered by our framework. 

Reflective languages have been advocated to enable programs to semantically 
extend the source language itself, by permitting them to run at the level of the 
language implementation with access to their own context [4,8]. The reflective 
tower [17] is the principal architecture of such languages. More needs to be known
about the extent to which reflective changes can be captured by robust semantics properties.

Experimental evidence for porting S-semantics via Id-interpreters (S ∘ Id)
has been given for inverse semantics (Inv) [16,1,2], and for translation semantics
(Trans_R) in the area of partial evaluation [9,5]. We are not aware of other work
developing mathematical foundations for a theory of semantics combinations,
but should mention related work [1] studying the subclass of semantics modifiers.
6 Conclusion and Future Work

The semantics conditions we identified allow us to reason about the combination
of semantics on an abstract level without referring to a particular implementation,
and to examine a large class of non-standard semantics instead of particular
instances (e.g., a specializer and a translator both implement a translation
semantics Trans_R). Our results suggest a technique for the implementation of
a certain class of programming language dialects by composing a hierarchy of
non-standard interpreters (e.g., inverse compilation by Trans_R ∘ Inv).

Among others, we can now answer the question raised in the introduction, 
namely what it means for the semantics of a language N if the implementation 
language L of its standard interpreter intN is interpreted in a non-standard
way (S ∘ Id = ?). As an example we showed that an inverse interpretation of
L implements an inverse interpreter for N (even though we have never written 
an inverse interpreter for N , only a standard interpreter intN). This is possible 
because Inv is a robust property that can be safely combined with Id. 

For some of the properties presented in this paper, practical demonstrations
of their combination exist (e.g., Id ∘ Id, Trans_R ∘ Id, Inv ∘ Id). In fact, for the first
two combinations it was shown that partial evaluation is strong enough to achieve 
efficient implementations. It is clear that more experimental work will be needed 
to examine to what extent these and other transformation techniques can opti- 
mize non-standard towers, and to what extent stronger techniques are required. 
We presented a dozen property combinations. Which of these combinations will 
be useful for which application is another practical question for future work. 
Abstract. We consider a general prescriptive type system with para-
metric polymorphism and subtyping for logic programs. The property of 
subject reduction expresses the consistency of the type system w.r.t. the 
execution model: if a program is “well-typed” , then all derivations start- 
ing in a “well-typed” goal are again “well-typed”. It is well-established 
that without subtyping, this property is readily obtained for logic pro- 
grams w.r.t. their standard (untyped) execution model. Here we give 
syntactic conditions that ensure subject reduction also in the presence 
of general subtyping relations between type constructors. The idea is to 
consider logic programs with a fixed dataflow, given by modes. 



1 Introduction 

Prescriptive types are used in logic and functional programming to restrict the 
underlying syntax so that only “meaningful” expressions are allowed. This allows 
for many programming errors to be detected by the compiler. Gödel [7] and
Mercury [15] are two implemented typed logic programming languages. 

A natural stability property one desires for a type system is that it is con- 
sistent with the execution model: once a program has passed the compiler, it 
is guaranteed that “well-typed” configurations will only generate “well-typed” 
configurations at runtime. Adopting the terminology from the theory of the
λ-calculus [17], this property of a typed program is called subject reduction. For
the simply typed λ-calculus, subject reduction states that the type of a λ-term
is invariant under reduction. This translates in a well-defined sense to functional
and logic programming. 

Semantically, a type represents a set of terms/expressions [8, 9]. Subtyping
makes type systems more expressive and flexible in that it allows one to express
inclusions among these sets. For example, if we have types int and real, we might
want to declare int ≤ real, i.e., the set of integers is a subset of the set of reals.
More generally, subtype relations such as list(u) ≤ term make it possible to type
Prolog meta-programming predicates [5], as shown in Ex. 1.4 below and Sec. 6.

* A long version of this paper, containing all proofs, is available in [14]. 
In functional programming, a type system that includes subtyping would
then state that wherever an expression of type σ is expected as an argument,
any expression having a type σ' ≤ σ may occur. The following example explains
this informally, using an ad hoc syntax.

Example 1.1. Assume two functions sqrt : real → real and fact : int → int
which compute the square root and factorial, respectively. Then sqrt (fact 3) is
a legal expression, since fact 3 is of type int and may therefore be used as an
argument to sqrt, because sqrt expects an argument of type real, and int ≤ real.

Subject reduction in functional programming crucially relies on the fact that 
there is a clear notion of dataflow. It is always the arguments (the “input”) of a 
function that may be smaller than expected, whereas the result (the “output”) 
may be greater than expected. This is best illustrated by a counterexample, 
which is obtained by introducing reference types. 

Example 1.2. Suppose we have a function f : real ref → real defined by
let f(x) = x := 3.14; return x. So f takes a reference (pointer) to a real as
argument, assigns the value 3.14 to this real, and also returns 3.14. Even though
int ≤ real, this function cannot be applied to an int ref, since the value 3.14
cannot be assigned to an integer.

In the example, the variable x is used both for input and output, and hence 
there is no clear direction of dataflow. While this problem is marginal in func- 
tional programming, it is the main problem for subject reduction in logic pro- 
gramming with subtypes. 

Subject reduction for logic programming means that resolving a “well-typed” 
goal with a “well-typed” clause will always result in a “well-typed” goal. It holds 
for parametric polymorphic type systems without subtyping [9, 10].¹

Example 1.3. In analogy to Ex. 1.1, suppose Sqrt/2 and Fact/2 are predicates
of declared type (Real, Real) and (Int, Int), respectively. Consider the program

Fact(3,6).
Sqrt(6,2.45).

and the derivations

Fact(3,x), Sqrt(x,y) ⇝ Sqrt(6,y) ⇝ □
Sqrt(6,x), Fact(x,y) ⇝ Fact(2.45,y)

In the first derivation, all arguments have a type that is less than or equal to the 
declared type, and so we have subject reduction. In the second derivation, the 
argument 2.45 to Fact has type Real, which is greater than the declared type. 
The atom Fact(2.45,y) is illegal, and so we do not have subject reduction. 

Here we address this problem by giving a fixed direction of dataflow to logic 
programs, i.e., by introducing modes [1] and replacing unification with double 
matching [2], so that the dataflow is always from the input to the output positions 
in an atom. We impose a condition on the terms in the output positions, or more 

¹ Note, however, that the first formulation of subject reduction [10] was incorrect [8].
precisely, on the types of the variables occurring in these terms: each variable
must have exactly the declared (expected) type of the position where it occurs. 

In Ex. 1.3, let the first argument of each predicate be input and the second 
be output. In both derivations, x has type Int. For Fact(3,x), this is exactly 
the declared type, and so the condition is fulfilled for the first derivation. For 
Sqrt(6, x), the declared type is Real, and so the condition is violated. 

The contribution of this paper is a statement that programs that are typed 
according to a type system with subtyping, and respect certain conditions con- 
cerning the modes, enjoy the subject reduction property, i.e., the type system 
is consistent w.r.t. the (untyped) execution model. This means that effectively 
the types can be ignored at runtime, which has traditionally been considered as 
desirable, although there are also reasons for keeping the types during execu- 
tion [11]. In Sec. 6, we discuss the conditions on programs. 

There are few works on prescriptive type systems for logic programs with 
subtyping [3, 4, 5, 6, 8]. Hill and Topor [8] give a result on subject reduction 
for systems without subtyping, and study general type systems with subtyp- 
ing. However their results on the existence of principal typings turned out to 
be wrong [3]. Beierle [3] shows the existence of principal typings for systems 
with subtype relations between constant types, and provides type inference al- 
gorithms. Beierle and also Hanus [6] do not claim subject reduction for their 
systems. Fages and Paltrinieri [5] have shown a weak form of subject reduction 
for constraint logic programs with subtyping, where equality constraints replace 
substitutions in the execution model. 

The idea of introducing modes to ensure subject reduction for logic programs 
was proposed previously by Dietrich and Hagl [4] . However they do not study the 
decidability of the conditions they impose on the subtyping relation. Furthermore 
since each result type must be transparent (a condition we will define later), 
subtype relations between type constructors of different arities are forbidden. 

Example 1.4. Assume types Int, String and List(u) defined as usual, and a
type Term that contains all terms (so all types are subtypes of Term). Moreover,
assume Append as usual with declared type (List(u), List(u), List(u)), and a
predicate Functor with declared type (Term, String), which gives the top functor
of a term. In our formalism, we could show subject reduction for the query
Append([1], [], x), Functor(x, y), whereas this is not possible in [4] because the
subtype relation between List(Int) and Term cannot be expressed.

The plan of the paper is as follows. Section 2 mainly introduces the type sys- 
tem. In Sec. 3, we show how expressions can be typed assigning different types 
to the variables, and we introduce ordered substitutions, which are substitutions 
preserving types. In Sec. 4, we show under which conditions substitutions ob- 
tained by unification are indeed ordered. In Sec. 5, we show how these conditions 
on unified terms can be translated into conditions on programs and derivations. 
Table 1. The subtyping order on types

(Par)      u ≤ u                                          u is a parameter

(Constr)   τ_ι(1) ≤ τ'_1   ...   τ_ι(m') ≤ τ'_m'
           ─────────────────────────────────────────      K ≤ K', ι = ι_K,K'
           K(τ_1, ..., τ_m) ≤ K'(τ'_1, ..., τ'_m')



2 The Type System 

We use the type system of [5]. First we recall some basic concepts [1]. When we 
refer to a clause in a program, we mean a copy of this clause whose variables are 
renamed apart from any other variables in the context. A query is a sequence 
of atoms. A query Q' is a resolvent of a query Q and a clause h ← B if
Q = a_1, ..., a_m, Q' = (a_1, ..., a_{k-1}, B, a_{k+1}, ..., a_m)θ, and h and a_k are
unifiable with MGU θ. Resolution steps and derivations are defined in the usual way.

2.1 Type Expressions 

The set of types 𝒯 is given by the term structure based on a finite set of
constructors 𝒦, where with each K ∈ 𝒦 an arity m ≥ 0 is associated (by writing
K/m), and a denumerable set 𝒰 of parameters. A flat type is a type of the
form K(u_1, ..., u_m), where K ∈ 𝒦 and the u_i are distinct parameters. We write
τ[σ] to denote that the type τ strictly contains the type σ as a subexpression.

A type substitution Θ is a mapping from parameters to types. The domain
of Θ is denoted by dom(Θ), the parameters in its range by ran(Θ). The set of
parameters in a syntactic object o is denoted by pars(o).

We assume an order ≤ on type constructors such that: K/m ≤ K'/m' implies
m ≥ m'; and, for each K ∈ 𝒦, the set {K' | K ≤ K'} has a maximum. Moreover,
we associate with each pair K/m ≤ K'/m' an injection ι_K,K' : {1, ..., m'} →
{1, ..., m} such that ι_K,K'' = ι_K,K' ∘ ι_K',K'' whenever K ≤ K' ≤ K''. This
order is extended to the subtyping order on types, denoted by ≤, as the least
relation satisfying the rules in Table 1.
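A minimal sketch of how the two rules of Table 1 could be checked mechanically. The encoding is an assumption made for illustration: a parameter is a string, a constructed type is a pair (constructor, tuple of argument types), and the hypothetical table `ORDER` lists the reflexive-transitive constructor order together with each injection ι, written 0-based.

```python
# Assumed constructor order: (K, K') -> injection iota, where position i of K'
# is compared with position iota[i] of K (0-based).
ORDER = {
    ("Int", "Int"): (), ("Real", "Real"): (), ("Int", "Real"): (),
    ("List", "List"): (0,), ("Anylist", "Anylist"): (), ("List", "Anylist"): (),
}

def is_subtype(s, t):
    """Decide s <= t using rule (Par) for parameters and rule (Constr) otherwise."""
    if isinstance(s, str) or isinstance(t, str):
        return s == t                      # (Par): a parameter is only below itself
    (k, s_args), (k2, t_args) = s, t
    iota = ORDER.get((k, k2))
    if iota is None:                       # K <= K' does not hold
        return False
    return all(is_subtype(s_args[iota[i]], t_args[i]) for i in range(len(t_args)))

INT, REAL = ("Int", ()), ("Real", ())
print(is_subtype(("List", (INT,)), ("List", (REAL,))))   # List(Int) <= List(Real)
print(is_subtype(("List", (INT,)), ("Anylist", ())))     # the argument is "forgotten"
print(is_subtype(("List", (REAL,)), ("List", (INT,))))
```

Because K/m ≤ K'/m' implies m ≥ m', the injection always has exactly as many entries as K' has arguments, which is what the loop over `t_args` relies on.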

Proposition 2.1. If σ ≤ τ then σΘ ≤ τΘ for any type substitution Θ.

Proposition 2.2. For each type σ, the set {τ | σ ≤ τ} has a maximum, which
is denoted by Max(σ).

For Prop. 2.2, it is crucial that K/m ≤ K'/m' implies m ≥ m'. For example,
if we allowed for Emptylist/0 ≤ List/1, then we would have Emptylist ≤
List(τ) for all τ, and so Prop. 2.2 would not hold. Note that the possibility of
“forgetting” type parameters, as in List/1 ≤ Anylist/0, may provide solutions
to inequalities of the form List(u) ≤ u, e.g. u = Anylist. However, we have:

Proposition 2.3. An inequality of the form u ≤ τ[u] has no solution. An
inequality of the form τ[u] ≤ u has no solution if u ∈ vars(Max(τ)).
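One way to read Prop. 2.2 operationally is sketched below: Max(σ) is obtained by replacing each constructor with the maximum constructor above it and keeping only the arguments selected by the injection. This recursive reading is an assumption of the sketch, not a definition taken from the text; the table `MAX_ABOVE`, the constructor `Pair` and the function name `max_type` are hypothetical, and types are encoded as strings (parameters) or pairs (constructor, tuple of arguments).

```python
# Assumed data: for each constructor K, the maximum constructor above K and the
# injection (0-based) selecting which argument positions of K survive.
MAX_ABOVE = {
    "Int": ("Real", ()), "Real": ("Real", ()),
    "List": ("Anylist", ()), "Anylist": ("Anylist", ()),
    "Pair": ("Pair", (0, 1)),
}

def max_type(t):
    """Sketch of Max(t): climb to the maximum constructor, recursively on kept arguments."""
    if isinstance(t, str):                 # a parameter is its own maximum
        return t
    k, args = t
    k_max, iota = MAX_ABOVE[k]
    return (k_max, tuple(max_type(args[i]) for i in iota))

print(max_type(("List", ("u",))))                 # the parameter u is forgotten
print(max_type(("Pair", (("Int", ()), "u"))))     # u survives inside Pair
```

With these assumed declarations, u does not occur in Max(List(u)), which matches the remark that List(u) ≤ u can be solved by u = Anylist, whereas u does occur in Max(Pair(Int, u)), the situation excluded by Prop. 2.3.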
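Table 2. The type system

(Var)        {x : τ, ...} ⊢ x : τ

(Func)       U ⊢ t_i : σ_i    σ_i ≤ τ_iΘ   (i ∈ {1, ..., n})
             ─────────────────────────────────────────────      Θ is a type substitution
             U ⊢ f_{τ_1...τ_n→τ}(t_1, ..., t_n) : τΘ

(Atom)       U ⊢ t_i : σ_i    σ_i ≤ τ_iΘ   (i ∈ {1, ..., n})
             ─────────────────────────────────────────────      Θ is a type substitution
             U ⊢ p_{τ_1...τ_n}(t_1, ..., t_n) Atom

(Headatom)   U ⊢ t_i : σ_i    σ_i ≤ τ_i   (i ∈ {1, ..., n})
             ─────────────────────────────────────────────
             U ⊢ p_{τ_1...τ_n}(t_1, ..., t_n) Headatom

(Query)      U ⊢ A_1 Atom   ...   U ⊢ A_n Atom
             ─────────────────────────────────
             U ⊢ A_1, ..., A_n Query

(Clause)     U ⊢ Q Query    U ⊢ A Headatom
             ─────────────────────────────
             U ⊢ A ← Q Clause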

2.2 Typed Programs 

We assume a denumerable set 𝒱 of variables. The set of variables in a syntactic
object o is denoted by vars(o). We assume a finite set ℱ (resp. 𝒫) of function
(resp. predicate) symbols, each with an arity and a declared type associated
with it, such that: for each f ∈ ℱ, the declared type has the form (τ_1, ..., τ_n, τ),
where n is the arity of f, (τ_1, ..., τ_n) ∈ 𝒯ⁿ, τ is a flat type and satisfies the
transparency condition [8]: pars(τ_1, ..., τ_n) ⊆ pars(τ); for each p ∈ 𝒫, the declared
type has the form (τ_1, ..., τ_n), where n is the arity of p and (τ_1, ..., τ_n) ∈ 𝒯ⁿ.
The declared types are indicated by writing f_{τ_1...τ_n→τ} and p_{τ_1...τ_n}. We assume
that there is a special predicate symbol =_{u,u} where u ∈ 𝒰.

We assume that 𝒦, ℱ, and 𝒫 are fixed by declarations in a typed program,
where the syntactical details are insignificant for our results. In examples we
loosely follow Gödel syntax [7].

A variable typing is a mapping from a finite subset of 𝒱 to 𝒯, written as
{x_1 : τ_1, ..., x_n : τ_n}. The restriction of a variable typing U to the variables in
o is denoted as U↾o. The type system, which defines terms, atoms etc. relative
to a variable typing U, consists of the rules shown in Table 2.

3 The Subtype and Instantiation Hierarchies 

3.1 Modifying Variable Typings 

We now show that if we can derive that some object is in the typed language 
using a variable typing U, then we can always modify U in three ways: extending 
its domain, instantiating the types, and making the types smaller. 

Definition 3.1. Let U, U' be variable typings. We say that U is smaller or
equal U', denoted U ≤ U', if U = {x_1 : τ_1, ..., x_n : τ_n}, U' = {x_1 : τ'_1, ..., x_n : τ'_n},
and for all i ∈ {1, ..., n}, we have τ_i ≤ τ'_i. We write U' ⊇≤ U if there exists
a variable typing U'' such that U' ⊇ U'' and U'' ≤ U.



Lemma 3.1. Let U, U' be variable typings and Θ a type substitution such
that U' ⊇≤ UΘ. If U ⊢ t : σ, then U' ⊢ t : σ' where σ' ≤ σΘ. Moreover, if
U ⊢ A Atom then U' ⊢ A Atom, and if U ⊢ Q Query then U' ⊢ Q Query.

3.2 Typed Substitutions 

Typed substitutions are a fundamental concept for typed logic programs. 

Definition 3.2. If U ⊢ x_1 = t_1, ..., x_n = t_n Query where x_1, ..., x_n are
distinct variables and for each i ∈ {1, ..., n}, t_i is a term distinct from x_i, then
({x_1/t_1, ..., x_n/t_n}, U) is a typed (term) substitution.

To show that applying a typed substitution preserves “well-typedness” for
systems with subtyping, we need a further condition. Given a typed substitution
(θ, U), the type assigned to a variable x by U must be sufficiently big, so that it
is compatible with the type of the term replaced for x by θ.

Example 3.1. Consider again Ex. 1.3. Taking U = {x : Int, y : Int}, we have
U ⊢ x : Int, U ⊢ 2.45 : Real, and hence U ⊢ x = 2.45 Atom. So ({x/2.45}, U) is
a typed substitution. Now U ⊢ Fact(x,y) Atom, but U ⊬ Fact(2.45,y) Atom.
The type of x is too small to accommodate the instantiation to 2.45.



Definition 3.3. A typed (term) substitution ({x_1/t_1, ..., x_n/t_n}, U) is an
ordered substitution if, for each i ∈ {1, ..., n}, where x_i : τ_i ∈ U, there exists
σ_i such that U ⊢ t_i : σ_i and σ_i ≤ τ_i.



We now show that expressions stay “well-typed” when ordered substitutions 
are applied [8, Lemma 1.4.2]. 

Lemma 3.2. Let (θ, U) be an ordered substitution. If U ⊢ t : σ then U ⊢ tθ : σ'
for some σ' ≤ σ. Moreover, if U ⊢ A Atom then U ⊢ Aθ Atom, and likewise for
queries and clauses.



4 Conditions for Ensuring Ordered Substitutions 

In this section, we show under which conditions it can be guaranteed that the 
substitutions applied in resolution steps are ordered substitutions. 
4.1 Type Inequality Systems

The substitution of a resolution step is obtained by unifying two terms, say t_1
and t_2. In order for the substitution to be typed, it is necessary that we can
derive U ⊢ t_1 = t_2 Atom for some U. We will show that if U is, in a certain
sense, maximal, then it is guaranteed that the typed substitution is ordered.
We first formalise paths leading to subterms of a term.

Definition 4.1. A term t has the subterm t in position ε. If t = f(t_1, ..., t_n)
and t_i has subterm s in position ζ, then t has subterm s in position i.ζ.

Example 4.1. The term F(G(C),H(C)) has subterm C in position 1.1, but also in
position 2.1. The position 2.1.1 is undefined for this term.
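A small sketch of Def. 4.1, assuming terms are encoded as pairs (functor, list of arguments), variables as plain strings, and positions as lists of 1-based indices. The function name `subterm` and the encoding are illustrative choices, not notation from the text.

```python
def subterm(term, position):
    """Return the subterm of `term` at `position`, or None if the position is undefined."""
    for i in position:
        if isinstance(term, str):          # a variable has no argument positions
            return None
        _functor, args = term
        if not 1 <= i <= len(args):
            return None
        term = args[i - 1]
    return term

C = ("C", [])
t = ("F", [("G", [C]), ("H", [C])])        # the term F(G(C), H(C)) of Ex. 4.1

print(subterm(t, [1, 1]))      # C
print(subterm(t, [2, 1]))      # C again, at a different position
print(subterm(t, [2, 1, 1]))   # None: the position is undefined
```

The empty position list plays the role of ε and returns the term itself.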

Let us write _ ⊢ t :≤ σ if there exist U and σ' such that U ⊢ t : σ' and σ' ≤ σ.
To derive U ⊢ t_1 = t_2 Atom, clearly the last step has the form

  U ⊢ t_1 : τ_1    U ⊢ t_2 : τ_2    τ_1 ≤ uΘ    τ_2 ≤ uΘ
  ──────────────────────────────────────────────────────
  U ⊢ t_1 =_{u,u} t_2 Atom

So we use an instance (u, u)Θ of the declared type of the equality predicate,
and the types of t_1 and t_2 are both less than or equal to uΘ. This motivates the
following question: Given a term t such that _ ⊢ t :≤ σ, what are the maximal
types of subterm positions of t with respect to σ?

Example 4.2. Let List/1, Anylist/0 ∈ 𝒦 where List(τ) ≤ Anylist for all τ,
and Nil_{→List(u)}, Cons_{u,List(u)→List(u)} ∈ ℱ. Consider the term [x, [y]] (in usual
list notation) depicted in Fig. 1, and let σ = Anylist. Each functor in [x, [y]] is
introduced using Rule (Func). E.g., any type of Nil in position 2.1.2 is necessarily
an instance of List(u^{2.1.2}), its declared type.² To derive that Cons(y, Nil) is a
typed term, this instance must be smaller than some instance of the second
declared argument type of Cons in position 2.1, i.e., List(u^{2.1}).

So in order to derive that [x, [y]] is a term of a type smaller than Anylist,
we need an instantiation of the parameters such that for each box (position),
the type in the lower subbox is smaller than the type of the upper subbox.

We see that in order for _ ⊢ t :≤ σ to hold, a solution to a certain type
inequality system (set of inequalities between types) must exist.

Definition 4.2. Let t be a term and σ a type such that _ ⊢ t :≤ σ. For each
position ζ where t has a non-variable subterm, we denote the function in this
position by f^ζ with declared type (τ^ζ_1, ..., τ^ζ_n, τ^ζ) (assuming that the
parameters in τ^ζ_1, ..., τ^ζ_n, τ^ζ are fresh, say by indexing them with ζ). For each
variable x ∈ vars(t), we introduce a parameter u^x (so u^x ∉ pars(σ)). The type
inequality system of t and σ is

$$\mathcal{I}(t,\sigma) = \{\tau^{\epsilon} \le \sigma\} \cup \{\tau^{\zeta.i} \le \tau^{\zeta}_{i} \mid \text{position } \zeta.i \text{ in } t \text{ is non-variable}\} \cup \{u^{x} \le \tau^{\zeta}_{i} \mid \text{position } \zeta.i \text{ in } t \text{ is variable } x\}.$$

² We use the positions as superscripts to parameters in order to obtain fresh copies.
Fig. 1. The term [x, [y]] and associated inequalities

A solution of 𝓘(t, σ) is a type substitution Θ such that dom(Θ) ∩ pars(σ) = ∅
and for each τ ≤ τ' ∈ 𝓘(t, σ), the inequality τΘ ≤ τ'Θ holds. A solution Θ to
𝓘(t, σ) is principal if for every solution Θ̃ for 𝓘(t, σ), there exists a Θ' such that
for each τ ≤ τ' ∈ 𝓘(t, σ), we have τΘ̃ ≤ τΘΘ' and τ'Θ̃ ≤ τ'ΘΘ'.



Proposition 4.1. Let t be a term and σ a type. If U ⊢ t :≤ σ for some variable
typing U, then there exists a solution Θ for 𝓘(t, σ) (called the solution for
𝓘(t, σ) corresponding to U) such that for each subterm t' in position ζ in t,
we have U ⊢ t' : τ^ζΘ if t' ∉ 𝒱, and U ⊢ t' : u^{t'}Θ if t' ∈ 𝒱.

In the next subsection, we present an algorithm, based on [5], which computes
a principal solution to a type inequality system, provided t is linear. In
Subsec. 4.3, our interest in principal solutions will become clear.



4.2 Computing a Principal Solution 

The algorithm transforms the inequality system, thereby computing bindings to
parameters which constitute the solution. It is convenient to consider systems of
both inequalities and equations of the form u = τ. The inequalities represent
the current type inequality system, and the equalities represent the substitution
accumulated so far. We use ≦ for ≤ or =.

Definition 4.3. A system is left-linear if each parameter occurs at most once
on the left hand side of an equation/inequality. A system is acyclic if it does
not have a subset {τ_1 ≦ σ_1, ..., τ_n ≦ σ_n} with pars(σ_i) ∩ pars(τ_{i+1}) ≠ ∅ for all
1 ≤ i ≤ n − 1, and pars(σ_n) ∩ pars(τ_1) ≠ ∅.



Proposition 4.2. If t is a linear term, then any inequality system 𝓘(t, σ) is
acyclic and left-linear.

By looking at Ex. 4.2, it should be intuitively clear that assuming linearity
of t is crucial for the above proposition.

We now give the algorithm. A solved form is a system I containing only
equations of the form I = {u_1 = τ_1, ..., u_n = τ_n} where the parameters u_i are all
different and have no other occurrence in I.

Definition 4.4. Given a type inequality system 𝓘(t, σ), where t is linear, the
type inequality algorithm applies the following simplification rules:

(1) {K(τ_1, ..., τ_m) ≤ K'(τ'_1, ..., τ'_m')} ∪ I → {τ_ι(i) ≤ τ'_i}_{i=1,...,m'} ∪ I
    if K ≤ K' and ι = ι_K,K'

(2) {u ≤ u} ∪ I → I

(3) {u ≤ τ} ∪ I → {u = τ} ∪ I[u/τ]
    if τ ≠ u, u ∉ vars(τ).

(4) {τ ≤ u} ∪ I → {u = Max(τ)} ∪ I[u/Max(τ)]
    if τ ∉ 𝒱, u ∉ vars(Max(τ)) and u ∉ vars(l) for any l ≤ r ∈ Σ.

Intuitively, left-linearity of 𝓘(t, σ) is crucial because it renders the binding of
a parameter (point (3)) unique.

Proposition 4.3. Given a type inequality system 𝓘(t, σ), where t is linear, the
type inequality algorithm terminates with either a solved form, in which case
the associated substitution is a principal solution, or a non-solved form, in which
case the system has no solution.
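The four simplification rules can be sketched as a small rewriting loop. This is a hedged illustration, not the authors' implementation: types are encoded as strings (parameters) or pairs (constructor, tuple of arguments); the tables `ORDER` and `MAX_ABOVE` are assumed example declarations; the side condition of rule (4) is read as "u does not occur on the left of any other remaining inequality"; and the input is assumed to be acyclic and left-linear as guaranteed by Prop. 4.2.

```python
# Assumed declarations: constructor order with injections (0-based) and maxima.
ORDER = {
    ("Int", "Int"): (), ("Real", "Real"): (), ("Int", "Real"): (),
    ("List", "List"): (0,), ("Anylist", "Anylist"): (), ("List", "Anylist"): (),
}
MAX_ABOVE = {"Int": "Real", "Real": "Real", "List": "Anylist", "Anylist": "Anylist"}

def params(t):
    """Set of parameters occurring in a type."""
    if isinstance(t, str):
        return {t}
    return set().union(*(params(a) for a in t[1]))

def subst(t, u, r):
    """Replace parameter u by type r inside t."""
    if isinstance(t, str):
        return r if t == u else t
    return (t[0], tuple(subst(a, u, r) for a in t[1]))

def max_type(t):
    """Sketch of Max(t), climbing to the maximum constructor above each constructor."""
    if isinstance(t, str):
        return t
    k_max = MAX_ABOVE[t[0]]
    return (k_max, tuple(max_type(t[1][i]) for i in ORDER[(t[0], k_max)]))

def solve(inequalities):
    """Apply rules (1)-(4) until a solved form is reached; None means no solution."""
    ineqs, eqs = list(inequalities), {}
    while ineqs:
        for idx, (l, r) in enumerate(ineqs):
            rest = ineqs[:idx] + ineqs[idx + 1:]
            binding = None
            if isinstance(l, str) and l == r:                          # rule (2)
                new = rest
            elif isinstance(l, str) and l not in params(r):            # rule (3)
                new, binding = rest, (l, r)
            elif not isinstance(l, str) and not isinstance(r, str):    # rule (1)
                iota = ORDER.get((l[0], r[0]))
                if iota is None:
                    return None
                new = rest + [(l[1][iota[i]], r[1][i]) for i in range(len(r[1]))]
            elif (isinstance(r, str) and r not in params(max_type(l))
                  and all(r not in params(l2) for l2, _ in rest)):     # rule (4)
                new, binding = rest, (r, max_type(l))
            else:
                continue                                               # try the next inequality
            if binding:
                u, t = binding
                new = [(subst(a, u, t), subst(b, u, t)) for a, b in new]
                eqs = {v: subst(s, u, t) for v, s in eqs.items()}
                eqs[u] = t
            ineqs = new
            break
        else:
            return None            # no rule applies: non-solved form
    return eqs

# System for the term Cons(x, Nil) against Anylist (parameters "ue", "u2", "ux" are fresh copies).
system = [(("List", ("ue",)), ("Anylist", ())),
          ("ux", "ue"),
          (("List", ("u2",)), ("List", ("ue",)))]
print(solve(system))
```

In the example the variable x receives the parameter bound to `ux`, which stays as general as the element parameter of the enclosing Cons.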



4.3 Principal Variable Typings 

The existence of a principal solution Θ of a type inequality system 𝓘(t, σ) and
Prop. 4.1 motivate defining the variable typing U such that Θ is exactly the
solution of 𝓘(t, σ) corresponding to U.

Definition 4.5. Let _ ⊢ t :≤ σ, and Θ be a principal solution of 𝓘(t, σ). A
variable typing U is principal for t and σ if U ⊇ {x : u^xΘ | x ∈ vars(t)}.

By the definition of a principal solution of 𝓘(t, σ) and Prop. 4.1, if U is a
principal variable typing for t and σ, then for any U' such that U'(x) > U(x)
for some x ∈ vars(t), we have U' ⊬ t :≤ σ.

The following key lemma states conditions under which a substitution ob- 
tained by unifying two terms is indeed ordered. 

Lemma 4.4. Let s and t be terms, s linear, such that U ⊢ s :≤ ρ, U ⊢ t :≤ ρ,
and there exists a substitution θ such that sθ = t. Suppose U is principal for s
and ρ. Then there exists a type substitution Θ such that for U' = UΘ↾vars(s) ∪
U↾(𝒱 ∖ vars(s)) we have that (θ, U') is an ordered substitution.
Example 4.3. Consider the term vectors (since Lemma 4.4 generalises in the
obvious way to term vectors) s = (3,x) and t = (3,6), let ρ = (Int,Int) and
U_s = {x : Int}, U_t = ∅ (see Ex. 1.3). Note that U_s is principal for s and ρ, and
so ({x/6}, U_s ∪ U_t) is an ordered substitution (Θ is empty).

In contrast, let s = (6,x) and t = (6,2.45), let ρ = (Real,Real) and U_s = {x :
Int}, U_t = ∅. Then U_s is not principal for s and ρ (the principal variable typing
would be {x : Real}), and indeed, there exists no Θ such that ({x/2.45}, U_sΘ ∪ U_t)
is an ordered substitution.

5 Nicely Typed Programs 

So far we have seen that matching, linearity, and principal variable typings are 
crucial to ensure that unification yields ordered substitutions. Note that those 
results generalise in the obvious way from terms to term vectors. We now define 
three corresponding conditions on programs and the execution model. 

First, we define modes [1]. For p/n ∈ 𝒫, a mode is an atom p(m_1, ..., m_n),
where m_i ∈ {I, O} for i ∈ {1, ..., n}. Positions with I (resp. O) are called input
(resp. output) positions of p. We assume that a mode is associated with each
p ∈ 𝒫. The notation p(s, t) means that s (resp. t) is the vector of terms filling
the input (resp. output) positions of p(s, t). Moded unification is a special case
of double matching [2].

Definition 5.1. Consider a resolution step where p(s, t) is the selected atom
and p(w, v) is the renamed apart clause head. The equation p(s, t) = p(w, v) is
solvable by moded unification if there exist substitutions θ_1, θ_2 such that
wθ_1 = s and vars(tθ_1) ∩ vars(vθ_1) = ∅ and tθ_1θ_2 = vθ_1. A derivation where all
unifications are solvable by moded unification is a moded derivation.
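Def. 5.1 can be read as two one-way matching steps, sketched below under an assumed term encoding: variables are strings, constants are Python numbers, and compound terms are tuples whose first component is the functor. The helper names `match`, `apply`, `variables` and `moded_unify` are hypothetical.

```python
def match(pattern, target, theta):
    """One-way matching: extend theta so that pattern instantiated by theta equals target."""
    if isinstance(pattern, str):                       # pattern is a variable
        if pattern in theta:
            return theta if theta[pattern] == target else None
        return {**theta, pattern: target}
    if isinstance(pattern, tuple) and isinstance(target, tuple):
        if pattern[0] != target[0] or len(pattern) != len(target):
            return None
        for p, t in zip(pattern[1:], target[1:]):
            theta = match(p, t, theta)
            if theta is None:
                return None
        return theta
    return theta if pattern == target else None        # constants must coincide

def apply(term, theta):
    """Apply a substitution to a term."""
    if isinstance(term, str):
        return theta.get(term, term)
    if isinstance(term, tuple):
        return (term[0],) + tuple(apply(a, theta) for a in term[1:])
    return term

def variables(term):
    """Set of variables of a term."""
    if isinstance(term, str):
        return {term}
    if isinstance(term, tuple):
        return set().union(*(variables(a) for a in term[1:]))
    return set()

def moded_unify(s, t, w, v):
    """Return (theta1, theta2) as in Def. 5.1, or None if the equation is not solvable."""
    vec = lambda terms: ("vec",) + tuple(terms)        # wrap a term vector as one term
    theta1 = match(vec(w), vec(s), {})                 # head inputs are matched against atom inputs
    if theta1 is None:
        return None
    t1, v1 = apply(vec(t), theta1), apply(vec(v), theta1)
    if variables(t1) & variables(v1):
        return None
    theta2 = match(t1, v1, {})                         # atom outputs are matched against head outputs
    return None if theta2 is None else (theta1, theta2)

# Selected atom Fact(3, x) against the clause head Fact(3, 6), first position input.
print(moded_unify(s=[3], t=["x"], w=[3], v=[6]))
```

The data flows in one direction only: the selected atom supplies the inputs, and the clause head supplies the outputs.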

Definition 5.2. A query Q = p_1(s_1, t_1), ..., p_n(s_n, t_n) is nicely moded if
t_1, ..., t_n is a linear vector of terms and for all i ∈ {1, ..., n}

$$vars(s_i) \cap \bigcup_{j=i}^{n} vars(t_j) = \emptyset. \qquad (1)$$

The clause C = p(t_0, s_{n+1}) ← Q is nicely moded if Q is nicely moded and

$$vars(t_0) \cap \bigcup_{j=1}^{n} vars(t_j) = \emptyset. \qquad (2)$$

An atom p(s, t) is input-linear if s is linear, output-linear if t is linear.
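Condition (1) and the linearity requirement for queries can be checked with a short sketch. It assumes each atom is abstracted to a pair (input variables, output variables), both given as lists of variable names with repetitions preserved; the function name `is_nicely_moded` is hypothetical.

```python
def is_nicely_moded(query):
    """query: list of (input_vars, output_vars) pairs, one per atom, in order."""
    outputs = [v for _, out in query for v in out]
    if len(outputs) != len(set(outputs)):          # the output vector must be linear
        return False
    for i, (inputs, _) in enumerate(query):
        # condition (1): inputs of atom i share no variable with outputs of atoms i..n
        later_outputs = {v for _, out in query[i:] for v in out}
        if set(inputs) & later_outputs:
            return False
    return True

# Fact(3, x), Sqrt(x, y): x is produced before it is consumed.
print(is_nicely_moded([([], ["x"]), (["x"], ["y"])]))
# Sqrt(x, y), Fact(3, x): x is consumed before it is produced.
print(is_nicely_moded([(["x"], ["y"]), ([], ["x"])]))
```

Condition (2) for clauses could be checked the same way by comparing the head's input variables with all body outputs.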

Definition 5.3. Let C = p_{τ_0,σ_{n+1}}(t_0, s_{n+1}) ← p^1_{σ_1,τ_1}(s_1, t_1), ..., p^n_{σ_n,τ_n}(s_n, t_n)
be a clause. If C is nicely moded, t_0 is input-linear, and there exists a variable
typing U such that U ⊢ C Clause, and for each i ∈ {0, ..., n}, U is principal for
t_i and τ'_i, where τ'_i is the instance of τ_i used for deriving U ⊢ C Clause, then
we say that C is nicely typed. A query U_Q : Q is nicely typed if the clause
Go ← Q is nicely typed.
We can now state the main result.

Theorem 5.1. Let C and Q be a nicely typed clause and query. If Q' is a
resolvent of C and Q where the unification of the selected atom and the clause 
head is solvable by moded unification, then Q' is nicely typed. 

Example 5.1. Consider again Ex. 1.3. The program is nicely typed, where the 
declared types are given in that example, and the first position of each predicate 
is input, and the second output. Both queries are nicely moded. The first query is 
also nicely typed, whereas the second is not (see also Ex. 4.3). For the first query, 
we have subject reduction, for the second we do not have subject reduction. 

6 Discussion 

In this paper, we have proposed criteria for ensuring subject reduction for typed 
logic programs with subtyping under the untyped execution model. Our starting 
point was a comparison between functional and logic programming: In functional 
programs, there is a clear notion of dataflow, whereas in logic programming, 
there is no such notion a priori, and arguments can serve as input arguments 
and output arguments. This difference is the source of the difficulty of ensuring 
subject reduction for logic programs. 

It is instructive to divide the numerous conditions we impose into four classes: 
(1) “basic” type conditions on the program (Sec. 2), (2) conditions on the ex- 
ecution model (Def. 5.1), (3) mode conditions on the program (Def. 5.2), (4) 
“additional” type conditions on the program (Def. 5.3). 

Concerning (1), our notion of subtyping deserves discussion. Approaches differ
with respect to conditions on the arities of type constructors for which there is
a subtype relation. Beierle [3] assumes that the (constructor) order is only defined
for type constants, i.e. constructors of arity 0. Thus we could have $Int \leq Real$,
and so by extension $List(Int) \leq List(Real)$, but not $List(Int) \leq Tree(Real)$.
Many authors assume that only constructors of the same arity are comparable.
Thus we could have $List(Int) \leq Tree(Real)$, but not $List(Int) \leq Anylist$.
We assume, as in [5], that if $K/m \leq K'/m'$, then $m \geq m'$. We think that this
choice is crucial for the existence of principal types.

Stroetmann and Glaß [16] argue that comparisons between constructors of
arbitrary arity should be allowed in principle. Their formalism is such that the 
subtype relation does not automatically correspond to a subset relation. Never- 
theless, the formalism heavily relies on such a correspondence, although it is not 
said how it can be decided. We refer to [14] for more details. 

Technically, what is crucial for subject reduction is that substitutions are
ordered: each variable is replaced with a term of a smaller type. In Section 4, we
gave conditions under which unification of two terms yields an ordered substitution:
the unification is a matching, the term that is being instantiated is linear
and is typed using a principal variable typing. The linearity requirement ensures
that a principal variable typing exists and can be computed (Subsec. 4.2).
In Sec. 5, we showed how those conditions translate to conditions on the program
and the execution model. We introduced modes and assumed that programs
are executed using moded unification (2). This might be explicitly enforced by
the compiler, or it might be verified statically [2]. Moded unification can actually
be very beneficial for efficiency, as witnessed by the language Mercury [15]. Apart
from that, (3) nicely-modedness states the linearity of the terms being instantiated
in a unification. Finally, (4) nicely-typedness states that the instantiated
terms must be typed using a principal variable typing.

Nicely-modedness has been widely used for verification purposes (e.g. [2]). In 
particular, the linearity condition on the output arguments is natural: it states 
that every piece of data has at most one producer. Input-linearity of clause heads 
however can sometimes be a demanding condition [13, Section 10.2]. 

Note that introducing modes into logic programming does not mean that 
logic programs become functional. The aspect of non-determinacy (possibility of 
computing several solutions for a query) remains. 

Even though our result on subject reduction means that it is possible to 
execute programs without maintaining the types at runtime, there are circum- 
stances where keeping the types at runtime is desirable, for example for memory 
management, printing, or in higher-order logic programming where the existence 
and shape of unifiers depend on the types [11].

There is a relationship between our notion of subtyping and transparency (see 
Subsec. 2.2). Transparency ensures that two terms of the same type have identical 
types in all corresponding subterms, e.g. if [1] and [x] are both of type List(Int),
we are sure that x is of type Int. Now in a certain way, allowing for a subtyping
relation that “forgets” parameters undermines transparency. For example,
we can derive $\{x : String\} \vdash [x] = [1]\ Atom$, since $List(String) \leq Anylist$
and $List(Int) \leq Anylist$, even though Int and String are incomparable. We
compensate for this by requiring principal variable typings. A principal variable
typing for [x] and Anylist contains $\{x : u\}$, and so $u$ can be instantiated to
Int. Our intuition is that whenever this phenomenon (“forgetting” parameters) 
occurs, requiring principal variable typings is very demanding; but otherwise, 
subject reduction is likely to be violated. As a topic for future work, we want to 
substantiate this intuition by studying examples. 
Abstract. We present a framework for decision making under uncertainty where
the priorities of the alternatives can depend on the situation at hand. We design a 
logic-programming language, DOP-CLP, that allows the user to specify the static 
priority of each rule and to declare, dynamically, all the alternatives for the
decisions that have to be made. In this paper we focus on a semantics that reflects
all possible situations in which the decision maker takes the most rational, possi- 
bly probabilistic, decisions given the circumstances. Our model theory, which is a 
generalization of classical logic-programming model theory, captures uncertainty 
at the level of total Herbrand interpretations. We also demonstrate that DOP-CLPs 
can be used to formulate game theoretic concepts. 



1 Introduction 

Reasoning with priorities and reasoning under uncertainty play an important role in 
human behavior and knowledge representation. Recent research has been focused on 
either priorities [14, 8, 6]¹ or uncertainty [10, 9, 12, 1], among many others.

We present a framework for decision making under uncertainty where the priorities 
of the alternatives depend on the different (probabilistic) situations. This way we obtain 
a semantics that reflects all possible situations in which the most rational (probabilis- 
tic) decisions are made, given the circumstances. The basic idea for the framework, a 
logic programming language called “Dynamically Ordered Probabilistic Choice Logic 
Programming” or DOP-CLP for short, incorporates the intuition behind both ordered 
logic programs ([8]) and choice logic programs ([4, 5]). The former models the
ability of humans to reason with defaults² in a logic programming context, using a static
ordering of the rules in the program. This works well, as long as probabilities stay out 
of the picture, but once they are present something extra is needed to express order. 
Take the famous “Tweety example” for instance: if you are sure that Tweety is indeed 
a penguin, you should derive that she cannot fly. But suppose you are only 30% sure

* Wishes to thank the FWO for its support. 

¹ [8] uses the word order instead of priority.

² Intuitively, something is true by default unless there is evidence to the contrary.
that the bird you are holding is indeed a penguin. Is it then sensible to derive that she is
a non-flying bird? 

By also taking into account the probabilities of the antecedents of the rules, in addition 
to their static order, we can overcome this problem. This leads to a dynamic ordering 
of rules, where the priority of a rule depends on the actual situation. 

We aim for a decision making framework that allows decisions to have possibly 
more than the two alternatives available in ordered logic³. To accomplish this, we turn
to a variant of Choice Logic Programs [4, 5], in which the possible alternatives for the
decisions are described by choice rules. This approach has two nice side effects. First 
of all, there is not necessarily a partition of the Herbrand base: atoms can belong to 
more than one decision or to no decision at all. In the former case, there is a probability 
distribution over the various alternatives. In the latter case, an atom is either true or 
false, as in classical logic programming. The second advantage of our approach is that 
we allow a “lazy” evaluation of the alternatives which become active only when they 
are present in the head of an applicable choice rule. 

An interesting application of DOP-CLP is Game Theory. We provide a transforma- 
tion from strategic games to DOP-CLPs such that a one-to-one mapping is established 
between the mixed strategy Nash Equilibria of the game and the stable models of its 
corresponding DOP-CLP. 



2 Dynamically Ordered Probabilistic Choice Logic Programs 

In this paper, we identify a program with its grounded version, i.e. the set of all ground 
instances of its clauses. In addition we do not allow function symbols (i.e. we stick to 
datalog) so the number of literals is finite. 

Definition 1. A Dynamically Ordered Probabilistic Choice Logic Program, or DOP-CLP
for short, is a finite set of rules of the form $A \leftarrow_p B$, where $A$ and $B$ are
(possibly empty) sets of atoms and $p \in \mathbb{R}^{+}$. For a rule $r \in P$, the set $A$ is called the
head, denoted $H_r$, while the set $B$ is called the body of the rule $r$, denoted $B_r$. The
$p \in \mathbb{R}^{+}$ denotes the priority of this rule. A rule without a priority number has an infinite
priority. We will denote the priority of rule $r$ as $\rho(r)$. The Herbrand base of $P$, denoted
$B_P$, is the set of all atoms appearing in $P$.

A rule $H \leftarrow_p B$ can be read as:

“The occurrence of the events in $B$ forces a probabilistic decision between
the elements $h \in H$ and supports each $h$ with a priority $p$.”

This means that rules with more than one head atom express no preference among the 
different alternatives they create. 

The priority of a rule $r$, $\rho(r)$, indicates the maximal impact of the situation,
described by the events of the body, on the preference of the head atom over the other
alternatives. The dynamic priority of a rule, which we will define later, adjusts this im- 
pact according to the probability of the situation described in the body. By combining 



³ In ordered logic the two alternatives are represented using negation.
all the dynamic priorities of rules sharing a common head atom, we obtain an evaluation
of the total impact on that atom which can then be used for comparison with other
alternatives. 

Example 1 (Jakuza). A young member of the Jakuza, the Japanese Mafia, is faced with
his first serious job. It will be his duty to control a new victim. Since it is his first job, he
goes to his oyabun, head of his clan and mentor, for advice. The oyabun tells him that the Jakuza
has three methods for controlling its victims: blackmail, intimidation and bribing. The
victim can either give in immediately or make a stand. In the latter case, she either
ignores the threats of the organization or threatens to go to the
police. The oyabun is only able to give some information about previous encounters,
which still needs to be interpreted in the current situation. So he starts telling about
his previous successes. “Every time I knew that the victim was willing to give
in, I resorted to intimidation as this is the easiest technique and each time it worked
perfectly. In case we knew that the victim was planning to stand up to us, we
looked into the possibility of bribing. Nine out of ten times, we were successful when
we just offered enough money. If you are sure that the victim will run to the police from
the moment that you approach her, you have to try to bribe her. Unfortunately this
technique worked only 4 times out of 10. When your victim tries to ignore you, you
should find something to blackmail her with. However, finding something interesting is
not that easy, as reflected by a success rate of 3 out of 10 times.

So now it is up to you to make a good estimation of the victim’s reaction in order to 
succeed with your assignment.” 

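All this information can easily be represented as the following DOP-CLP:

$$
\begin{array}{rcl}
\mathit{jakuza} & \leftarrow & \\
\mathit{blackmail} \oplus \mathit{intimidate} \oplus \mathit{bribe} & \leftarrow_{0} & \mathit{jakuza} \\
\mathit{stand\text{-}up} \oplus \mathit{give\text{-}in} & \leftarrow_{0} & \\
\mathit{ignore} \oplus \mathit{police} & \leftarrow_{0} & \mathit{stand\text{-}up} \\
\mathit{intimidate} & \leftarrow_{10} & \mathit{give\text{-}in} \\
\mathit{bribe} & \leftarrow_{4} & \mathit{police} \\
\mathit{blackmail} & \leftarrow_{3} & \mathit{ignore} \\
\mathit{enough} \oplus \mathit{more} & \leftarrow_{0} & \mathit{stand\text{-}up} \\
\mathit{bribe} & \leftarrow_{9} & \mathit{enough}
\end{array}
$$
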

An interpretation assigns a probability distribution over every state of affairs⁴.

Definition 2. Let $P$ be a DOP-CLP. A (probabilistic) interpretation is a probability
distribution $I : 2^{B_P} \to [0..1]$.

In our examples, we will mention only the probabilities of those states that have a 
positive probability in the interpretation. 

⁴ Each state corresponds to a total interpretation of the choice logic program obtained from P
by omitting the priorities. Because we are working with total interpretations we only have to 
mention the positive part of the interpretation. 
Example 2. Recall the Jakuza program of Example 1. The following functions I, J and
K are interpretations for this program⁵:

$$
\begin{array}{lll}
I(\{j, br, \ldots\}) = \ldots & J(\{j, in, p, e, s\}) = 3/20 & K(\{j, bl, s, ig, e\}) = 30/441 \\
I(\{j, br, \ldots\}) = \ldots & J(\{j, in, p, m, g\}) = 7/10 & K(\{j, bl, s, ig, m\}) = 240/441 \\
I(\{j, br, s, ig\}) = \ldots & J(\{j, in, p, m, s\}) = 2/20 & K(\{j, bl, s, p, e\}) = 12/441 \\
I(\{j, br, \ldots\}) = \ldots & J(\{j, br, p, m, s\}) = 1/20 & K(\{j, bl, s, p, m\}) = 96/441 \\
I(\{j, br, s, m\}) = \ldots & & K(\{j, bl, g, ig, e\}) = 5/441 \\
 & & K(\{j, bl, g, ig, m\}) = 40/441 \\
 & & K(\{j, bl, g, p, e\}) = 2/441 \\
 & & K(\{j, bl, g, p, m\}) = 16/441
\end{array}
$$


Given an interpretation, we can compute the probability of a set of atoms, as the 
sum of the probabilities assigned to those situations which contain this set of atoms. 



Definition 3. Let $I$ be an interpretation for a DOP-CLP $P$. The probability of a set $A \subseteq B_P$,
denoted $\vartheta_I(A)$⁶, is $\vartheta_I(A) = \sum_{A \subseteq Y \subseteq B_P} I(Y)$.
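
As a small illustration of this definition, the following Python sketch computes the probability of a set of atoms by summing over the states that contain it. It uses the interpretation J of Example 2 with the abbreviated atom names; the dictionary encoding and the function name `probability` are assumptions made for this sketch only.

```python
from fractions import Fraction as F

# Interpretation J of Example 2: each state (a set of abbreviated atom
# names) is mapped to its probability; unlisted states have probability 0.
J = {
    frozenset({"j", "in", "p", "e", "s"}): F(3, 20),
    frozenset({"j", "in", "p", "m", "g"}): F(7, 10),
    frozenset({"j", "in", "p", "m", "s"}): F(2, 20),
    frozenset({"j", "br", "p", "m", "s"}): F(1, 20),
}


def probability(interp, atoms):
    """Sum the probabilities of all states that contain every atom in `atoms`."""
    wanted = frozenset(atoms)
    return sum((p for state, p in interp.items() if wanted <= state), F(0))


print(probability(J, {"in"}))       # 19/20
print(probability(J, {"in", "e"}))  # 3/20
```

The values for `in`, `p` and `e` agree with the factors 19/20, 1 and 3/20 that appear in Example 6.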

In choice logic programs, the basis of DOP-CLP, a rule is applicable when the body 
is true, and is applied when both the body and a single head atom are true. This situation 
becomes more tricky when probabilities come into play. Applicability is achieved when 
the body has a non-zero probability. In order for a rule to be applied it must be applica- 
ble. In addition, we demand that at least one head element has a chance of happening 
and that no two of them can happen simultaneously. 

Definition 4. Let $I$ be an interpretation for a DOP-CLP $P$.

1. A rule $r \in P$ is called applicable iff $\vartheta_I(B_r) > 0$.

2. An applicable rule $r \in P$ is applied iff $\exists a \in H_r \cdot \vartheta_I(a) > 0$ and $\forall S \in 2^{H_r}$ with
$|S| > 1 : \vartheta_I(S) = 0$.

We have been referring to alternatives of decisions without actually defining them. 
Two atoms are alternatives if they appear together in the head of an applicable choice 
rule. Alternatives are thus dynamic, since the applicability of the rules depends on the 
interpretation. 

Definition 5. Let $I$ be an interpretation for a DOP-CLP $P$.

- Two atoms $a, b \in B_P$ are alternatives wrt $I$ iff $\exists$ applicable $r \in P \cdot \{a, b\} \subseteq H_r$.

- The set of all alternatives of an atom $a \in B_P$ wrt $I$ is denoted $\Omega_I(a)$⁷.

- A set $D \subseteq B_P$ is a maximal alternative set wrt $I$ iff $\forall a, b \in D \cdot a$ and $b$ are alternatives
and $\forall c \notin D \cdot \exists a \in D \cdot a$ and $c$ are no alternatives.

- $\Delta_I$ is the set of all maximal alternative sets wrt $I$.

- An atom $a \in B_P$ is called single iff $\Omega_I(a) = \{a\}$.

⁵ For brevity, the names of the atoms are abbreviated.

⁶ When the set $A$ contains just one element $a$ we omit the brackets and write $\vartheta_I(a)$.

⁷ Notice that $a \in \Omega_I(a)$. The set $\Omega_I(a) \setminus \{a\}$, denoted $\Omega^{-}_I(a)$, is the set of all alternatives of $a$
excluding itself.
A naive approach to defining a probability distribution is to insist that the sum of
probabilities of the multiple elements in the head of a choice rule must be one. This 
approach fails in situations of the following kind: 

$a \oplus b \oplus c \leftarrow$

$a \oplus b \leftarrow$



In this situation, the atom c would not stand a chance of obtaining a positive probability, 
although this might be the most favorable alternative. 

To overcome this problem, we introduced maximal alternative sets. They group all the
atoms that have an alternative relation with each other. It is those sets that will be used 
for the probability distribution. In the next definition we call an interpretation total if 
it defines a probability distribution in which the probabilities of the elements of any 
maximal alternative set add up to one. Furthermore, for all decisions that need to be 
made, an alternative is selected for every possible outcome. 

Definition 6. An interpretation $I$ for a DOP-CLP $P$ is called total iff $\forall D \in \Delta_I$,

- $\sum_{a \in D} \vartheta_I(a) = 1$, and

- $\forall A \subseteq B_P$ such that $\vartheta_I(A) > 0 \cdot |A \cap D| = 1$.



Example 3. Reconsider the Jakuza program of Example 1 and the interpretations of
Example 2. The interpretation I is not total. Indeed, consider the maximal alternative
set $\{ig, p\}$. We have $\{ig, p\} \cap \{br, j, s, m\} = \emptyset$, while $I(\{br, j, s, m\}) > 0$, and
$\vartheta_I(ig) + \vartheta_I(p) = 5/6 + 1/12 \neq 1$. The interpretations J and K are total.

As we mentioned earlier, the dynamic priority of a rule adjusts the (static) prefer- 
ence of the rule to the probability that this situation might actually occur. It does this 
by giving the maximal contribution of the body atoms to the general preference of the 
head atoms. The dynamic priority of an atom is obtained by taking into account every 
real contribution of any situation that provides a choice for this atom. 

Definition 7. Let $I$ be an interpretation for a DOP-CLP $P$. The dynamic priority of a
rule $r \in P$, denoted $\rho_I(r)$, equals $\rho_I(r) = \rho(r) * \vartheta_I(B_r)$.

The dynamic priority of an atom $a \in B_P$, denoted $\rho_I(a)$, is $\rho_I(a) = \sum_{r \in P : a \in H_r} \rho_I(r)$.
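
The following Python sketch illustrates how dynamic priorities could be computed for the rules of the Jakuza program that have an action in their head. It rests on two assumptions that should be read as such: the dynamic priority of an atom is taken to be the sum of the dynamic priorities of the rules with that atom in the head, and the body probabilities in `theta_K` are the marginals of the interpretation K as read from Example 2. The names `rule_priority` and `atom_priority` are hypothetical.

```python
from fractions import Fraction as F

# Rules as (head atoms, static priority, body atoms); only the rules whose
# head contains one of the three actions are listed.
RULES = [
    ({"blackmail", "intimidate", "bribe"}, 0, {"jakuza"}),
    ({"intimidate"}, 10, {"give-in"}),
    ({"bribe"}, 4, {"police"}),
    ({"blackmail"}, 3, {"ignore"}),
    ({"bribe"}, 9, {"enough"}),
]

# Assumed probabilities of the body atoms under K (marginals of Example 2).
theta_K = {
    "jakuza": F(1),
    "give-in": F(1, 7),
    "police": F(2, 7),
    "ignore": F(5, 7),
    "enough": F(1, 9),
}


def rule_priority(priority, body, theta):
    """Static priority times the probability of the body.

    Every body in this fragment is a single atom, so theta[atom] suffices.
    """
    (atom,) = body
    return priority * theta[atom]


def atom_priority(atom, rules, theta):
    """Sum the dynamic priorities of all rules that have `atom` in their head."""
    return sum(
        (rule_priority(p, body, theta) for head, p, body in rules if atom in head),
        F(0),
    )


for action in ("blackmail", "bribe", "intimidate"):
    print(action, atom_priority(action, RULES, theta_K))
```

With these inputs, blackmail and bribe both obtain 15/7 while intimidate obtains 10/7, so intimidate is the only action with a strictly better alternative. The same `rule_priority` applied to the rule with priority 3 and a body probability of 5/6 gives 5/2, the value quoted for blackmail in Example 4.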

The dynamic priority will be used to determine which alternatives of a decision are 
eligible candidates and which ones are not. An atom is said to be blocked if there exists 
an alternative which has higher dynamic priority. Preferred atoms are those that block 
every other alternative. The competitors of an atom are those alternatives which are not 
blocked by this atom. Their dynamic priority is thus at least as high as that of the atom. 

Definition 8. Let $I$ be an interpretation for a DOP-CLP $P$. An atom $a \in B_P$ is blocked
by $b \in \Omega^{-}_I(a)$ w.r.t. $I$ iff $\rho_I(b) > \rho_I(a)$.

An atom $a \in B_P$ is called preferred in $I$ iff $\forall b \in \Omega^{-}_I(a) \cdot \rho_I(a) > \rho_I(b)$.

The atom $a$ is a competitor of the atom $b \in \Omega^{-}_I(a)$ w.r.t. $I$ if $b$ does not block $a$ w.r.t. $I$.
In standard logic programming an interpretation is a model if every rule is either
not applicable or applied. When priorities are involved, in order for an interpretation to 
become a model, it must be possible to assign a zero-probability to atoms which have a 
more favorable alternative with non-zero probability. 

Definition 9. Let $I$ be an interpretation for the DOP-CLP $P$. $I$ is a model for $P$ iff
$\forall r \in P$:

- $\vartheta_I(B_r) = 0$, i.e. $r$ is not applicable, or

- $r$ is applied, or

- $\forall a \in H_r \cdot \exists b$ competitor of $a$ w.r.t. $I \cdot \vartheta_I(b) > 0$.

Example 4. Consider again the Jakuza program of Example 1 and its interpretations of
Example 2. The interpretation I is not a model, since the rule $\mathit{blackmail} \leftarrow_{3} \mathit{ignore}$
does not satisfy any of the above conditions. This rule is applicable, since $\vartheta_I(\mathit{ignore}) =
5/6$; it is not applied, since $\vartheta_I(\mathit{blackmail}) = 0$; it does not have any competitors since
$\rho_I(\mathit{blackmail}) = 5/2$ while $\rho_I(\mathit{intimidate}) = 0$ and $\rho_I(\mathit{bribe}) = 29/12$. The
interpretations J and K are both models.

Proposition 1. Let $P$ be a DOP-CLP and let $I$ be a model for it. If $a \in B_P$ is a
preferred atom then $\exists r \in P : a \in H_r \cdot r$ is applied.

In some cases, atoms receive a probability which they actually do not deserve. This 
happens when there is some better qualified alternative (i.e., an alternative that has 
a higher dynamic priority) that should obtain this probability. Such atoms are called 
assumptions, since they were just “assumed” to have a chance of happening.

Definition 10. Let $I$ be an interpretation for a DOP-CLP $P$. An atom $a \in B_P$ is
called an assumption w.r.t. $I$ iff $\vartheta_I(a) > 0$ when either $a$ is blocked or $a$ is single and
$\rho_I(a) = 0$. $I$ is assumption-free iff it contains no assumptions.

Example 5. Consider once more the Jakuza program of Example 1 and its interpretations
in Example 2. The interpretation J is not assumption-free, as $\vartheta_J(\mathit{intimidate}) > 0$
and the alternative bribe blocks intimidate, since $\rho_J(\mathit{bribe}) = 4 + 27/20 > 3.5 =
\rho_J(\mathit{intimidate})$. Intuitively, because bribing is more successful than intimidation,
intimidation should not be considered at all. The interpretation K is assumption-free.

Proposition 2. Let $P$ be a DOP-CLP and let $I$ be a total assumption-free interpretation
for $P$. If $a \in B_P$ is a preferred atom then $\exists r \in P : a \in H_r \cdot r$ is applied and $\vartheta_I(a) = 1$.



Interpretations evaluate the likelihood of every possible outcome, by assigning a 
probability distribution to every situation. These probabilities are influenced by the 
atoms which are present in each situation. In order to quantify this influence one must 
know whether the events which occur in any such interpretation are independent of 
each other. An interpretation which assumes that there is no inter-dependence between 
atoms, is said to be “independent”, as follows: 
Definition 11. Let $I$ be an interpretation for a DOP-CLP $P$. We say that $I$ is independent
iff $\forall A \subseteq B_P$:

- $\vartheta_I(A) = 0$, or

- $\forall D \in \Delta_I$ s.t. $|D \cap A| \leq 1 \cdot \vartheta_I(A) = \prod_{a \in A} \vartheta_I(a)$.

Example 6. Consider the interpretations J and K of Example 2. The interpretation J
is not independent as $\vartheta_J(\{j, in, p, e, s\}) = 3/20 \neq \vartheta_J(j) * \vartheta_J(in) * \vartheta_J(p) * \vartheta_J(e) *
\vartheta_J(s) = 1 * 19/20 * 1 * 3/20 * 8/20$. The interpretation K is independent.
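
The independence test can be sketched in Python as follows. The sketch reads the definition as: every set of atoms that has non-zero probability and takes at most one atom from each maximal alternative set must have a probability equal to the product of the probabilities of its atoms. The maximal alternative sets are passed in explicitly, and both this reading and the function names are assumptions of the sketch.

```python
from fractions import Fraction as F
from itertools import combinations
from math import prod


def probability(interp, atoms):
    """Sum the probabilities of all states that contain every atom in `atoms`."""
    wanted = frozenset(atoms)
    return sum((p for state, p in interp.items() if wanted <= state), F(0))


def is_independent(interp, alternative_sets):
    """Check the product rule on every admissible set of atoms."""
    atoms = sorted(set().union(*interp))
    for size in range(1, len(atoms) + 1):
        for subset in combinations(atoms, size):
            chosen = set(subset)
            # Skip sets that pick two atoms from the same alternative set.
            if any(len(chosen & d) > 1 for d in alternative_sets):
                continue
            joint = probability(interp, chosen)
            expected = prod(probability(interp, {a}) for a in chosen)
            if joint != 0 and joint != expected:
                return False
    return True


# Interpretation J of Example 2 (abbreviated atom names).
J = {
    frozenset({"j", "in", "p", "e", "s"}): F(3, 20),
    frozenset({"j", "in", "p", "m", "g"}): F(7, 10),
    frozenset({"j", "in", "p", "m", "s"}): F(2, 20),
    frozenset({"j", "br", "p", "m", "s"}): F(1, 20),
}
JAKUZA_SETS = [{"bl", "in", "br"}, {"s", "g"}, {"ig", "p"}, {"e", "m"}]

# A hypothetical two-decision interpretation built as a product distribution.
PRODUCT = {
    frozenset({"a", "c"}): F(1, 6),
    frozenset({"a", "d"}): F(1, 3),
    frozenset({"b", "c"}): F(1, 6),
    frozenset({"b", "d"}): F(1, 3),
}

print(is_independent(J, JAKUZA_SETS))                   # False
print(is_independent(PRODUCT, [{"a", "b"}, {"c", "d"}]))  # True
```

Exact fractions are used instead of floating-point numbers so that the equality test in the product rule is not affected by rounding.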



Definition 12. Let P be a DOP-CLP. A total independent assumption-free model is
said to be stable. A stable model is crisp if it assigns probability one to a single subset
of the Herbrand base.



Example 7. For the last time we return to the Jakuza example and its three interpreta- 
tions I,J and K from Example 2. Combining the results from Examples 3, 4, 5 and 6, 
we can conclude that K is the only stable model of the three. 

A stable model for the Jakuza example represents a rational choice where the proba- 
bility of the action is consistent with the estimates on the victim’s reactions. In general, 
stable models reveal all possible situations in which the decisions are made rationally, 
considering the likelihood of the events that would force such decisions. 

3 An Application of DOP-CLPs: Equilibria of Strategic Games 

3.1 Strategic Games 

A strategic game models a situation where several agents (called players) independently 
choose which action they should take, out of a limited set of possibilities. The result of 
the actions is determined by the combined effect of the choices made by each of the 
players. Players have a preference for certain outcomes over others. Often, preferences 
are modeled indirectly using the concept of payoff, where players are assumed to prefer
outcomes where they receive a higher payoff. 

Example 8 (Bach or Stravinsky). Two people wish to go out together to a music concert. 
They have a choice between a Bach or Stravinsky concert. Their main concern is to be 
together, but one person prefers Bach and the other prefers Stravinsky. If they both 
choose Bach then the person who preferred Bach gets a payoff of 2 and the other a 
payoff of 1. If both go for Stravinsky, it is the other way around. If they pick different 
concerts, they both get a payoff of zero. 

The game is represented in Fig. 1. One player’s actions are identified with the rows and 
the other player’s with the columns. The two numbers in the box formed by row r and 
column c are the players’ payoffs when the row player chooses r and the column player 
chooses c. The first of the two numbers is the payoff of the row player. 
                Bach    Stravinsky
    Bach        2,1     0,0
    Stravinsky  0,0     1,2

Fig. 1. Bach or Stravinsky (BoS)


Definition 13 ([11]). A strategic game is a tuple $\langle N, (A_i), (u_i) \rangle$ where

- $N$ is a finite set of players;

- for each player $i \in N$, $A_i$ is a nonempty set of actions that are available to her⁸,
and

- for each player $i \in N$, $u_i : A = \times_{j \in N} A_j \to \mathbb{R}$ is a utility function which
describes the players' preferences.

An element $a \in A$ is called a profile. For a profile $a$ we use $a_i$ to denote the component
of $a$ in $A_i$. For any player $i \in N$, we define $A_{-i} = \times_{j \in N \setminus \{i\}} A_j$. Similarly, an element
of $A_{-i}$ will often be denoted as $a_{-i}$. For any $a_{-i} \in A_{-i}$ and $a_i \in A_i$, $(a_{-i}, a_i)$ is the
profile $a' \in A$ in which $a'_i = a_i$ and $a'_j = a_j$ for all $j \neq i$.

A game $\langle N, (A_i), (u_i) \rangle$ is played when each player $i \in N$ selects a single action
from the set Ai of actions available to her. Since players are thought to be rational, it is 
assumed that a player will select an action that, to the best of her knowledge, leads to 
a “preferred” profile. Of course, this is limited by the fact that she must decide without 
knowing what the other players will choose. 

The notion of Nash equilibrium shows that, in many cases, it is possible to limit the 
possible outcomes (profiles) of the game. 

Definition 14 ([11]). A Nash equilibrium of a strategic game $\langle N, (A_i), (u_i) \rangle$ is a profile
$a^*$ satisfying $\forall a_i \in A_i \cdot (a^*_{-i}, a^*_i) \geq_i (a^*_{-i}, a_i)$.

Intuitively, a profile $a^*$ is a Nash equilibrium if no player can unilaterally improve
upon her choice. This means that, given the other players' actions $a^*_{-i}$, $a^*_i$ is the best
player $i$ can do⁹.

Although the notion of Nash equilibrium is intuitive, it does not provide a solution 
to every game. Take for example the Matching Pennies game. 

Example 9 (Matching Pennies). Two people toss a coin. Each of them has to choose 
head or tail. If the choices differ, person 1 pays person 2 a Euro; if they are the same, 
person 2 pays person 1 a Euro. Each person cares only about the amount of money that 
she receives. The game modeling this situation is depicted in Fig. 2. This game does not 
have a Nash equilibrium. 
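
The absence of a Nash equilibrium in Matching Pennies, and the two equilibria of Bach or Stravinsky, can be checked by brute force. The Python sketch below enumerates all profiles and rejects those from which some player can profit by deviating on her own; the payoff tables are copied from Fig. 1 and Fig. 2, while the function name and the single-letter action names are assumptions of the sketch.

```python
from itertools import product


def pure_nash_equilibria(actions, payoff):
    """actions: one list of actions per player.
    payoff: maps each profile (a tuple) to a tuple of utilities."""
    equilibria = []
    for profile in product(*actions):
        stable = True
        for i, own_actions in enumerate(actions):
            for alternative in own_actions:
                # Player i deviates alone; everybody else keeps her action.
                deviation = profile[:i] + (alternative,) + profile[i + 1:]
                if payoff[deviation][i] > payoff[profile][i]:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria


BOS = {
    ("B", "B"): (2, 1), ("B", "S"): (0, 0),
    ("S", "B"): (0, 0), ("S", "S"): (1, 2),
}
PENNIES = {
    ("H", "H"): (1, 0), ("H", "T"): (0, 1),
    ("T", "H"): (0, 1), ("T", "T"): (1, 0),
}

print(pure_nash_equilibria([["B", "S"], ["B", "S"]], BOS))      # [('B', 'B'), ('S', 'S')]
print(pure_nash_equilibria([["H", "T"], ["H", "T"]], PENNIES))  # []
```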

The intuitive strategy to choose head or tail with 50% frequency each (yielding a 
profit in 25% of the cases) corresponds with a mixed strategy Nash equilibrium where 
agents assign a probability distribution over their actions. 



⁸ We assume that $A_i \cap A_j = \emptyset$ whenever $i \neq j$.

⁹ Note that the actions of the other players are not known to $i$.
            Head    Tail
    Head    1,0     0,1
    Tail    0,1     1,0

Fig. 2. Matching Pennies (Example 9).



Definition 15 ([11]). The mixed extension of the strategic game $\langle N, (A_i), (u_i) \rangle$ is the
strategic game $\langle N, (\Delta(A_i)), (U_i) \rangle$ in which $\Delta(A_i)$ is the set of probability distributions
over $A_i$, and $U_i : \times_{j \in N} \Delta(A_j) \to \mathbb{R}$ assigns to each $\alpha \in \times_{j \in N} \Delta(A_j)$ the
expected value under $u_i$ of the lottery over $A$ that is induced by $\alpha$ (so that $U_i(\alpha) =
\sum_{a \in A} (\prod_{j \in N} \alpha_j(a_j)) u_i(a)$ if $A$ is finite).

Note that $U_i(\alpha) = \sum_{a_i \in A_i} \alpha_i(a_i) U_i(\alpha_{-i}, e(a_i))$, for any mixed strategy profile
$\alpha$, where $e(a_i)$ is the degenerate mixed strategy of player $i$ that attaches probability one
to $a_i \in A_i$. This holds because we are working with finite sets of actions (e.g. $A_i$).

Definition 16 (Mixed Strategy Nash Equilibrium). A mixed strategy Nash equilibrium
of a strategic game is a Nash equilibrium of its mixed extension.

Example 10. Although the Matching Pennies game (Example 9) does not have a Nash
equilibrium, it has the single mixed strategy Nash equilibrium $\{\{Head : 1/2, Tail :
1/2\}, \{Head : 1/2, Tail : 1/2\}\}$, which corresponds to how humans would reason.
Apart from its two Nash equilibria, the Bach or Stravinsky game (Example 8)
also has the extra mixed strategy Nash equilibrium $\{\{Bach : 2/3, Stravinsky :
1/3\}, \{Bach : 1/3, Stravinsky : 2/3\}\}$.
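
These mixed equilibria can be verified numerically. The Python sketch below computes the expected utility of the mixed extension and then tests every pure deviation, which is sufficient because the expected utility is linear in a player's own strategy. Payoffs are those of Fig. 1 and Fig. 2; the encoding of a mixed profile as a list of dictionaries and the function names are assumptions of the sketch.

```python
from fractions import Fraction as F
from itertools import product


def expected_utility(player, mixed, payoff):
    """Sum over all pure profiles of (product of probabilities) * utility."""
    total = F(0)
    for profile in product(*(strategy.keys() for strategy in mixed)):
        weight = F(1)
        for strategy, action in zip(mixed, profile):
            weight *= strategy[action]
        total += weight * payoff[profile][player]
    return total


def is_mixed_nash(mixed, payoff):
    """No player may gain by switching to one of her pure strategies."""
    for i, strategy in enumerate(mixed):
        current = expected_utility(i, mixed, payoff)
        for action in strategy:
            pure = {a: F(int(a == action)) for a in strategy}
            deviation = mixed[:i] + [pure] + mixed[i + 1:]
            if expected_utility(i, deviation, payoff) > current:
                return False
    return True


BOS = {
    ("B", "B"): (2, 1), ("B", "S"): (0, 0),
    ("S", "B"): (0, 0), ("S", "S"): (1, 2),
}
PENNIES = {
    ("H", "H"): (1, 0), ("H", "T"): (0, 1),
    ("T", "H"): (0, 1), ("T", "T"): (1, 0),
}

bos_mixed = [{"B": F(2, 3), "S": F(1, 3)}, {"B": F(1, 3), "S": F(2, 3)}]
pennies_mixed = [{"H": F(1, 2), "T": F(1, 2)}, {"H": F(1, 2), "T": F(1, 2)}]
bos_uniform = [{"B": F(1, 2), "S": F(1, 2)}, {"B": F(1, 2), "S": F(1, 2)}]

print(is_mixed_nash(bos_mixed, BOS))          # True
print(is_mixed_nash(pennies_mixed, PENNIES))  # True
print(is_mixed_nash(bos_uniform, BOS))        # False
```

In the mixed equilibrium of Bach or Stravinsky both players obtain an expected utility of 2/3, which is lower than what either of them receives in the two pure equilibria.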

Each strategic game has at least one mixed strategy Nash equilibrium. Furthermore, 
each Nash equilibrium is also a mixed strategy Nash equilibrium and every crisp mixed 
strategy Nash equilibrium (where all the probabilities are either 0 or 1) corresponds to a
Nash equilibrium. 

3.2 Transforming Strategic Games to DOP-CLPs 

In this subsection, we propose an intuitive transformation from strategic games
to DOP-CLPs such that the stable models of the latter correspond with the mixed
strategy Nash equilibria of the former.

Definition 17. Let $\langle N, (A_i), (u_i) \rangle$ be a strategic game. The corresponding DOP-CLP
$P$ equals $P = \{A_i \leftarrow_{0} \mid \forall i \in N\} \cup \{a_i \leftarrow_{u_i(a)} a_{-i} \mid a \in A, \forall i \in N\}$.

The corresponding DOP-CLP contains two types of rules. First, there are the real 
choice rules which represent, for each player, the actions she can choose from. The 
zero priority assures that the choice itself does not contribute to the decision making 
process. Rules of the second type represent all the decisions a player can make (the 
heads) according to the situations that the other players can create (the bodies). A rule’s 
priority corresponds with the payoff that the deciding player would receive for the pure 
strategy profile corresponding to the head and body of the rule. 
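
The transformation described above can be sketched in Python. The sketch follows the prose reading of the two rule types: one zero-priority choice rule per player, and one rule per profile and player whose head is that player's action, whose body holds the other players' actions and whose priority is the player's payoff. The triple encoding `(head, priority, body)`, the function name and the atom names `b1`, `s1`, `b2`, `s2` are assumptions of the sketch.

```python
from itertools import product


def game_to_dop_clp(actions, payoff):
    """Return the rules of the corresponding program as (head, priority, body).

    actions: one list of action names per player (names must be disjoint).
    payoff: maps each profile (a tuple) to a tuple of utilities.
    """
    # Choice rules: each player has to pick one of her actions.
    rules = [(frozenset(own), 0, frozenset()) for own in actions]
    # Decision rules: one per profile and player.
    for profile in product(*actions):
        for i, action in enumerate(profile):
            body = frozenset(profile[:i] + profile[i + 1:])
            rules.append((frozenset({action}), payoff[profile][i], body))
    return rules


BOS_ACTIONS = [["b1", "s1"], ["b2", "s2"]]
BOS_PAYOFF = {
    ("b1", "b2"): (2, 1), ("b1", "s2"): (0, 0),
    ("s1", "b2"): (0, 0), ("s1", "s2"): (1, 2),
}

for head, priority, body in game_to_dop_clp(BOS_ACTIONS, BOS_PAYOFF):
    print(sorted(head), priority, sorted(body))
```

For a game with two players and two actions each, the result has two choice rules and eight decision rules.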
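Example 11. The Bach or Stravinsky game (Example 8) can be mapped to the DOP-CLP
P:

$$
\begin{array}{rclrcl}
b_1 \oplus s_1 & \leftarrow_{0} & & b_2 \oplus s_2 & \leftarrow_{0} & \\
b_1 & \leftarrow_{2} & b_2 & b_2 & \leftarrow_{1} & b_1 \\
b_1 & \leftarrow_{0} & s_2 & b_2 & \leftarrow_{0} & s_1 \\
s_1 & \leftarrow_{0} & b_2 & s_2 & \leftarrow_{0} & b_1 \\
s_1 & \leftarrow_{1} & s_2 & s_2 & \leftarrow_{2} & s_1
\end{array}
$$

This program has the following stable models:

$$
\begin{array}{lll}
I_1(b_1, b_2) = 2/9 & I_2(b_1, b_2) = 1 & I_3(b_1, b_2) = 0 \\
I_1(b_1, s_2) = 4/9 & I_2(b_1, s_2) = 0 & I_3(b_1, s_2) = 0 \\
I_1(s_1, b_2) = 1/9 & I_2(s_1, b_2) = 0 & I_3(s_1, b_2) = 0 \\
I_1(s_1, s_2) = 2/9 & I_2(s_1, s_2) = 0 & I_3(s_1, s_2) = 1
\end{array}
$$
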



In this example, the probabilities of the actions correspond with the ones given for the mixed
strategy Nash equilibria. The following theorem demonstrates that this is generally true.



Theorem 1. Let $\langle N, (A_i), (u_i) \rangle$ be a strategic game and let $P$ be its corresponding
DOP-CLP and let $I$ and $\alpha^*$ be respectively an interpretation for $P$ and a mixed strategy
profile for $\langle N, (A_i), (u_i) \rangle$ such that $\forall a \in A\ \forall i \in N$, $\alpha^*_i(a_i) = \vartheta_I(a_i)$. Then, $I$ is a
stable model iff $\alpha^*$ is a mixed strategy Nash equilibrium.



4 Relationships to Other Approaches 

4.1 Logic Programming 

It is easy to see that positive logic programs are a subclass of the dynamically ordered 
choice logic programs, and that the stable models for both systems coincide. All nec- 
essary properties follow immediately from the way we handle single atoms. With the 
current semantics it is impossible to have a mapping between the stable models of a 
choice logic program ([4]) and the crisp stable models of the corresponding DOP-CLP. 
Indeed, our system is more credulous, since it allows a pure choice (probability 1) when 
two alternatives are equally preferred. However, we have that every stable model of a 
CLP is also a crisp stable model of the corresponding DOP-CLP. 

4.2 Priorities 

The logic programming language using priorities that corresponds best to our approach 
is dynamically ordered choice logic programming (OCLP) introduced in [6]. Although 
OCLP does not work with probabilities, these two systems have a common approach 
to and a similar notion of alternatives, in the sense that alternatives appear in the head 
of an applicable choice rule. OCLP also requires that this choice rule has a higher pri- 
ority than the rule for which one computes the head atoms’ alternatives. So, the main 
difference with our approach is the way that OCLP uses priority to create alternatives. 
Ordered logic programs ([8]) can easily be transformed to DOP-CLPs in such a way 
that the credulous stable models of the former correspond with the crisp stable models 
of the latter. For the same reason that we mentioned for CLPs, it is not yet possible 
to represent the skeptical stable model semantics for ordered logic programs. In [3],
preference in extensive disjunctive logic programming is considered. As far as overri- 
ding is concerned the technique corresponds rather well with skeptical defeating of [6], 
but alternatives are fixed as an atom and its (classical) negation. Dynamic preference 
in extended logic programs is introduced in [2] in order to obtain a better suited well- 
founded semantics. Preferences/priorities are incorporated here as rules in the program. 
While alternatives make our system dynamic, [2] introduces the dynamics via a stabil- 
ity criterion that overrules preference information but the alternatives remain static. A 
totally different approach is proposed in [14]. Here the preferences are defined between 
atoms without references to the program. After defining models in the usual way, one 
then uses preferences to filter out the less preferred models. 

4.3 Uncertainty 

A lot of researchers [1, 10, 9, 12, 13] have tackled the problem of bringing probabili- 
ties into logic programming. The probabilities used can be divided into two categories 
depending on the type of knowledge symbolized: statistical or belief. [1] concentrates 
on the first type while [10, 12, 13] are more interested in the latter. [9] is one of the few 
that is able to handle both types. Our formalism focuses mainly on knowledge of belief 
although it is possible to use statistical knowledge for defining the static priorities. Another
difference between the various systems is the way they introduce probabilities
and handle conjunctions. For example, [9] works with probability intervals and then 
uses the rules of probability to compute the probability of formulae. In this respect, we 
adopt the possible world/model theory of [10, 12]. However, we introduce probabilities 
at the level of interpretations, while they hard-code the alternatives by means of disjoint 
declarations together with probabilities, and the other atoms are computed by means of 
the minimal models of the logic program. 

4.4 Games and Logic Programming 

The logical foundations of game theory have been studied for a long time in epistemic 
logic. Only recently, researchers have become interested in the relationships between 
game theory and logic programming. The first to do so was [7]. It was shown that n- 
person games or coalition games can be transformed into an argumentation framework 
such that the NM-solutions of the game correspond with the stable extensions of the 
corresponding argumentation framework. [7] illustrated also that every argumentation 
framework can be transformed into a logic program such that the stable extensions of 
the former coincide with the stable models of the latter. In [4] it was demonstrated that 
each strategic game could be transformed into a CLP such that the Nash equilibria of 
the former correspond with the stable models of the latter. [6] shows that OCLPs can 
be used for an elegant representation of extensive games with perfect information such 
that, depending on the transformation, either the Nash or the subgame perfect equili- 
bria of the game correspond with the stable models of the program. Concerning mixed 
strategy Nash equilibria of strategic games, the approach which is the most related to 
ours is the Independent Choice Logic of [13]. [13] uses (acyclic) logic programs to de- 
terministically model the consequences of choices made by agents. Since choices are 
external to the logic program, [13] restricts the programs further, not only to be deterministic
(i.e. each choice leads to a unique stable model) but also to be independent, in
the sense that literals representing alternatives may not influence each other, e.g. they 
may not appear in the head of rules. ICL is further extended to reconstruct much of 
classical game theory and other related fields. The main difference with our approach 
is that we do not go outside of the realm of logic programming to recover the notion of 
equilibrium. The basis of the formalism in [13] does not contain probabilities but works with
selector functions over the hypotheses and then works with the (unique) stable model 
that comes from the program itself. This way one creates a possible world semantics. 
Our transformation makes sure that every atom is an alternative of a choice/decision for 
which a probability can be computed. 
Abstract. We describe some new, simple and apparently general methods
for designing FPT algorithms, and illustrate how these can be used to
obtain a significantly improved FPT algorithm for the Maximum Leaf
Spanning Tree problem. Furthermore, we sketch how the methods can
be applied to a number of other well-known problems, including the
parametric dual of Dominating Set (also known as Nonblocker),
Matrix Domination, Edge Dominating Set, and Feedback Vertex
Set for Undirected Graphs. The main payoffs of these new methods
are in improved functions $f(k)$ in the FPT running times, and in general
systematic approaches that seem to apply to a wide variety of problems.



1 Introduction 

The investigations on which we report here are carried out in the framework 
of parameterized complexity, so we will begin by making a few general remarks 
about this context of our research. The subject is concretely motivated by an 
abundance of natural examples of two different kinds of complexity behaviour. 
These include the well-known problems Min Cut Linear Arrangement, 
Bandwidth, Vertex Cover, and Minimum Dominating Set (for defini- 
tions the reader may refer to [GJ79]). 

All four of these problems are NP-complete, an outcome that is now so routine
that we are almost never surprised. In the classical complexity framework
that pits polynomial-time solvability against the ubiquitous phenomena of
NP-hardness, they are therefore indistinguishable. All four of these decision problems
take as input a pair consisting of a graph G and a positive integer k. The positive 
integer k is the natural parameter for all four problems, although one might also 
wish to consider eventually other problem parameterizations, such as treewidth. 
We have the following contrasting facts: 
1. Min Cut Linear Arrangement and Vertex Cover are solvable in linear
time for any fixed k.

2. The best known algorithms for Bandwidth and Minimum Dominating
Set are respectively $O(n^k)$ and $O(n^{k+1})$.

In fact, we now have very strong evidence, in the framework of parameterized 
complexity, that probably Bandwidth and Minimum Dominating Set do not 
admit algorithms of the qualitative type that Min Cut Linear Arrangement 
and Vertex Cover have. 

1.1 What Is the Nature of This Evidence? 

We offer a new view of the study of computability and of its sequel, the concern
with efficient computability. We describe how this naturally divides into three
main zones of discussion, anchored by variations on the Halting Problem.

In the first zone, we have the unsolvability of the Halting Problem, the
unsolvability of other problems following by recursive reductions, and Gödel’s Theorem
as a corollary.

In the second zone, we have “classical complexity” where the reference problem
is the Halting Problem for Nondeterministic P-time Turing Machines.
This problem is trivially NP-complete and essentially defines the class
NP. Another way to look at this problem is as a generic embodiment of
computation that is potentially exponential. If at each nondeterministic step there
were two possible choices of transition, then it is a reasonable conjecture that,
in general, we will not be able to analyze Turing machines of size n for the
possibility of halting in n steps in much less than $O(2^n)$ time.^1

The third zone of negotiation with intractability is anchored by the k-Step
Halting Problem for Nondeterministic Turing Machines (k-NDTM),
where in this case we mean Turing machines with an unrestricted alphabet size,
and with unrestricted nondeterminism at each step. This is a generic
embodiment of $n^k$ computational complexity. For the same reasons as in the second
zone of negotiation, we would not expect any method of solving this problem
that greatly improves on exhaustively exploring the n-branching depth-k tree of
possible computation paths (for a Turing machine of size n).

This leads to the following three basic definitions of the parameterized complex- 
ity framework. 



Definition 1. A parameterized language is a subset $L \subseteq \Sigma^* \times \Sigma^*$. For
notational convenience, and without any loss of generality, we can also consider that
$L \subseteq \Sigma^* \times \mathbb{N}$.

^1 It would take $O(2^n)$ time to exhaustively explore the possible computation paths,
because nondeterministic Turing machines are so unstructured and opaque, i.e.,
such a generic embodiment of exponential possibility.
Definition 2. A parameterized language L is fixed-parameter tractable (FPT) if
there is an algorithm to determine if $(x, k) \in L$ in time $f(k) + n^c$, where $|x| = n$,
c is a constant, and f is a function (unrestricted).



Definition 3. A parameterized language L is many:1 parametrically reducible
to a parameterized language $L'$ if there is an FPT algorithm that transforms
$(x,k)$ into $(x',k')$ so that:

1. $(x,k) \in L$ if and only if $(x',k') \in L'$, and

2. $k' \le g(k)$ (where g is an unrestricted function; $k'$ is purely a function of k).

The analog of NP in the third zone of discussion is the parameterized complexity
class W[1] [DF95b]. That the k-NDTM problem is complete for W[1] was proven
by Cai, Chen, Downey and Fellows in [CCDF97]. Since Bandwidth and
Dominating Set are hard for W[1], we thus have strong natural evidence that they
are not fixed-parameter tractable, as Vertex Cover and Min Cut Linear
Arrangement are.^2

1.2 What Is the Current Status of This “Third Zone” of Discussion? 

As an example of how useful FPT algorithms can be, we know that Vertex
Cover can be solved (after several rounds of improvement) in time $O((1.27)^k + n)$
for graphs of size n [CKJ99]. The problem is thus well-solved for $k < 100$
analytically. In practice, this worst-case analytical bound appears to be pessimistic.
The current best FPT algorithms for this problem (which deliver an optimal
solution if they terminate) appear to have completely solved the problem for
input graphs in toto, so long as the parameter value is $k < 200$. These new
effective algorithms for small ranges of k have applications in computational biology
[Ste99].

Another promising result is the fixed-parameter-tractable algorithm for
3-Hitting Set presented in [NR00b]. The running time is $O(2.270^k + n)$ where
k is the size of the hitting set to determine and n denotes the length of the
encoding of the input. Like Vertex Cover, 3-Hitting Set has applications in
computational biology.

The Biologist, knowing a bit about algorithms, asks for an algorithm that is
“like sorting”, i.e., a polynomial-time algorithm for her problem. Working with
an NP-hard problem, if the parameter of the problem to be solved is rather
small (e.g., Protein Folding involves as the (natural) parameter the number

^2 Further background on parameterized complexity can be found in [DF98]. We
remark in passing that if the fundamental mission of theoretical computer science is
conceived of as empirical and explanatory, like theoretical physics, then a two- (or 
more-) dimensional theoretical framework might well be more suitable than the one- 
dimensional framework inherited from recursion theory, to the task of explaining 
the crucial differences in intrinsic problem complexity encountered in natural com- 
putational practice, even if the explanatory framework involves phenomenologically 
“unequal” dimensions — a situation frequently encountered in physics. 
of adjacencies between hydrophobic constituents of the protein sequence, that
is, the parameter is less than 100 for interesting applications), the knowledgeable
biologist will henceforth ask, “Can I get an algorithm like Vertex Cover?”
There is no way to answer this question without taking the discussion into the 
third natural zone. 

We remark that the parameterized complexity of Protein Folding as well as
Topological Containment for Graphs and Directed Feedback Vertex Set is
still open. All three problems are conjectured to be in FPT. For NP-completeness
of Protein Folding and Topological Containment for Graphs we refer
to [CGPPY98, BL98] and [DF98], respectively. NP-completeness of Directed
Feedback Vertex Set is shown in [GJ79].

1.3 The Substantial Open Question About Parameterized 
Complexity 

Logically, in some sense, NP-completeness can now reasonably be considered
a rather unimportant issue for problems that are, when
naturally parameterized, fixed-parameter tractable, and for which the main
applications are covered by small or moderate parameter ranges. From a practical
point of view, however, there is still an important unresolved question that motivates our
work in this paper:

What are typical functions $f(k)$ for problems in FPT?

We make two contributions. 

— We substantially improve the best known FPT algorithm for the Max Leaf 
Spanning Tree problem. The best previous algorithm, due to Downey and
Fellows [DF95a, DF98], has a running time of $O(n + (2k)^{4k})$. Our algorithm
runs in time $O(n + (k+1)(14.23)^k)$. In the concluding section we discuss the
fine-grained significance of this improvement.

— We introduce new methods that appear to be widely useful in designing 
improved FPT algorithms. The first new method is that of coordinatized 
kernelization arguments for establishing problem kernelizations. The second 
new method, catalytic reduction, employs a small amount of partial informa- 
tion about potential solutions to guide the efficient development of a search 
tree. 

2 Prototype: An Improved FPT Algorithm for the Max 
Leaf Spanning Tree Problem 

The history of the problem and some recent complexity developments can be 
found in [D74, GJ79, GMM94, GMM97, LR98]. An interesting application of 
the problem is described in [KKRUW95]. The flagship problem for the new 
techniques we introduce here is defined as follows. 
Max Leaf Spanning Tree

Input: A graph G and a positive integer k. 

Parameter: k 

Question: Does G have a spanning tree with at least k leaves? 



One of the remarkably nice properties of FPT is that the following is an equiv- 
alent definition of the tractable class of parameterized problems [DFS99]. 

Definition 4. A parameterized language L is in FPT if and only if there is:

1. A function $g(k)$.

2. A 2-variable polynomial $q(n,k)$.

3. A many:1 parametric reduction $\Phi$ of L to itself requiring time at most
$q(n,k)$, that transforms an instance $(x,k)$, where $|x| = n$, to an instance
$(x',k')$ with $|x'| \le g(k)$ and $k' \le k$, so that $(x,k) \in L$ if and only if
$(x',k') \in L$.

In other words, a problem with parameter k is in FPT if and only if an input to 
the problem can be reduced in ordinary polynomial time to an equivalent input 
whose size is bounded by a function (only) of the parameter. For most problems
in FPT, moreover, a natural set of reduction rules is known that accomplishes
the transformation $\Phi$ by a series of “local simplifications”. This process is termed
kernelization in the terminology of [DF95a]^3 and, currently, the main practical
methods of FPT-algorithm design are based on kernelization and the method of
bounded search trees.

The idea of kernelization is relatively simple and can be quickly illustrated
for the Vertex Cover problem. If the instance is $(G, k)$ and G has a pendant
vertex v of degree 1 connected to the vertex u, then it would be silly to include
v in any solution (it would be better, and equally necessary, to include u), so
$(G,k)$ can be reduced to $(G', k-1)$, where $G'$ is obtained from G by deleting
u and v. Some more complicated and much less obvious reduction rules for the
Vertex Cover problem can be found in the current state-of-the-art FPT
algorithms (see [BFR98, DFS99, CKJ99, NR99b, Ste99]). The basic schema of this
method of FPT algorithm design is that reduction rules are applied until an
irreducible instance $(G',k')$ is obtained. At this point in the FPT algorithm, a
Kernelization Lemma is invoked to decide all those instances where the reduced
instance $G'$ is larger than $g(k')$ for some function g. For example, in the cases
of Vertex Cover and Planar Dominating Set, if a reduced graph $G'$ is larger
than $g(k')$ for a suitable linear function g, then $(G',k')$ is a no-instance. In the case of Max
Leaf Spanning Tree and Nonblocker, large reduced instances are
automatically yes-instances.
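
To make the pendant-vertex rule concrete, the following is a minimal sketch of it for Vertex Cover. It is an illustration only and not one of the cited state-of-the-art kernelizations. The graph representation (a dictionary mapping each vertex to the set of its neighbours) and the name `pendant_rule` are assumptions made for this example.

```python
def pendant_rule(adj, k):
    """Exhaustively apply the pendant-vertex rule for Vertex Cover.

    adj: dict mapping each vertex to the set of its neighbours (undirected).
    Returns (reduced graph, remaining budget, vertices forced into the cover).
    """
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    chosen = []
    changed = True
    while changed and k > 0:
        changed = False
        for v in list(adj):
            if len(adj[v]) == 1:
                (u,) = adj[v]            # the unique neighbour of the pendant v
                for w in adj.pop(u):     # delete u together with its edges
                    adj[w].discard(u)
                adj.pop(v)               # delete the pendant vertex v
                chosen.append(u)         # u goes into the cover
                k -= 1
                changed = True
                break                    # degrees changed, rescan from scratch
    return adj, k, chosen


# A path a-b-c-d has a vertex cover of size 2, found by the rule alone.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
reduced, budget, cover = pendant_rule(path, 2)
```

On the path above the rule fires twice and leaves an empty graph with budget 0, so the instance is decided without any search. In general the rule only shrinks the instance, and further rules plus a Kernelization Lemma are needed to bound its size.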



^3 These natural kernelization algorithms have significant applications in the design of
heuristics for hard problems, since they are a reasonable preprocessing step for any 
algorithmic attack on an intractable problem [DFS99]. 
In the first phase of our algorithm a set of reduction rules transforms an instance
$(G, k)$ of Max Leaf Spanning Tree to another instance $(G', k')$ where $k' \le k$
and $|G'| \le 5.75k$. By exploring all k-subsets of the problem kernel $G'$, this
immediately implies an FPT algorithm with running time $O(n + [...]) = O(n + k(33.1)^k)$.
But we will do substantially better than that, namely we present
an algorithm running in time $O(n + (k+1)(14.23)^k)$.

Our algorithm has three phases: 

Phase 1: Reduction to a problem kernel of size 5.75k. 

Phase 2: The introduction of catalytic vertices. 

Phase 3: A search tree based on catalytic branching (section 2.2) and coordi- 
natized reduction (section 2.1). 

Our algorithm is actually based on a slight variation on Max Leaf Spanning 
Tree defined as follows. 

Catalytic Max Leaf Spanning Tree 

Input: A graph $G = (V, E)$ with a distinguished catalytic vertex $t \in V$, $k \in \mathbb{N}$

Parameter: k

Question: Does G have a spanning tree T having at least k leaves, such that t 
is an internal vertex of T? 



In the following subsection we prove that this variant of the original problem has
a kernel of size $5.75k$. Because the reduction rules used are almost identical to
the reduction rules used in the proof that Max Leaf Spanning Tree has a
kernel of size $5.75k$, we will concentrate on the kernelization of Catalytic Max
Leaf Spanning Tree only.

2.1 The Kernelization Lemma and the Method of Coordinatized 
Kernels 

How does one proceed to discover an adequate set of reduction rules, or elucidate
(and prove) a bounding function $g(k)$ that ensures that, for instances larger than this
bound, the question can be answered simply?

The technique of coordinatized kernels is aimed at these difficulties, and we 
will illustrate it by example with the Max Leaf Spanning Tree problem. We 
seek a Lemma of the following form: 

Lemma 1. If $(G = (V,E), k)$ is a reduced instance of Catalytic Max Leaf
Spanning Tree with catalytic vertex $t \in V$, and G has more than $g(k)$ vertices,
then $(G, k)$ with catalytic vertex t is a yes-instance.

Proof. Suppose that:

(1) G has more than $g(k)$ vertices. (We will determine $g(k)$ at the end of this
subsection.)

(2) G is connected and reduced. (As we make the argument, we will see how to
define the reduction rules.)

(3) G is a yes-instance for k, witnessed by a subtree T (with t internal; not
necessarily spanning) having k leaves.

(4) G is a no-instance for $k+1$.

(5) Among all such G satisfying (1-4), the witnessing tree T has a minimum
possible number of vertices.

(6) Among all such G and T satisfying (1-5), the quantity $\sum_{l \in L} d(t,l)$ is
minimized, where L is the set of leaves of T and $d(t,l)$ is the distance in T to
the root vertex t.

Then we argue for a contradiction. 

Comment. The point of all this is to set up a framework for argument that will 
allow us to see what reduction rules are needed, and what $g(k)$ can be achieved.
In essence we are setting up a (possibly elaborate, in the spirit of extremal 
graph theory) argument by minimum counterexample — and using this as a 
discovery process for the FPT algorithm design. Condition (3) gives us a way of 
“coordinatizing” the situation by giving us the structure of a solution to refer 
to (how this is used will become clear as we proceed). 

Since G is connected, any tree subgraph T of G with k leaves extends to a 
spanning tree with k leaves. This witnessing subgraph given by condition (3) 
is minimized by condition (5). Refer to the vertices of V — T as outsiders. The 
following claims are easily established. The first five claims are enforced by con- 
dition (4). 

Claim 1: No outsider is adjacent to an internal vertex of T . 

Claim 2: No leaf of T can be adjacent to two outsiders. 

Claim 3: No outsider has three or more outsider neighbors. 

Claim 4: No outsider with 2 outsider neighbors is connected to a leaf of T.

Claim 5: The graph induced by the outsider vertices has no cycles.

It follows from Claims (1-5) that the subgraph induced by the outsiders consists 
of a collection of paths, where the internal vertices of the paths have degree 2 in 
G. Since we are ultimately attempting to bound the size of G, this suggests (as 
a discovery process) the following reduction rule for kernelization. 

Kernelization Rule 1: If (G,k) has two adjacent vertices u and v of degree 2, 
neither of which is the catalyst t, then: 

(Rule 1.1) If uv is a bridge, then contract uv to obtain G' and let k' = k.
(Rule 1.2) If uv is not a bridge, then delete the edge uv to obtain G' and 
let k' = k. 
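
As an illustration of how Kernelization Rule 1 might be applied mechanically, here is a minimal sketch of a single application. The adjacency representation (a dictionary of neighbour sets, modified in place) and the helper names `reachable_without_edge` and `apply_rule_1` are assumptions made for this example, and the bridge test is a plain graph search rather than an optimized linear-time routine. The parameter is left unchanged, as in the rule.

```python
def reachable_without_edge(adj, u, v):
    """Return True if v can be reached from u without using the edge uv."""
    seen, stack = {u}, [u]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if (x, y) in ((u, v), (v, u)) or y in seen:
                continue
            seen.add(y)
            stack.append(y)
    return v in seen


def apply_rule_1(adj, t):
    """Apply Kernelization Rule 1 once, in place; return True if it fired.

    adj: dict mapping each vertex to the set of its neighbours.
    t:   the catalytic vertex, which the rule must not touch.
    """
    for u in list(adj):
        if u == t or len(adj[u]) != 2:
            continue
        for v in list(adj[u]):
            if v == t or len(adj[v]) != 2:
                continue
            is_bridge = not reachable_without_edge(adj, u, v)
            adj[u].discard(v)
            adj[v].discard(u)
            if is_bridge:
                # Rule 1.1: contract uv by merging v into u.
                for w in adj.pop(v):
                    adj[w].discard(v)
                    adj[w].add(u)
                    adj[u].add(w)
            # Rule 1.2 (uv is not a bridge): deleting the edge is all we do.
            return True
    return False


# Path 0-1-2-3-4 with catalyst 0: edge 1-2 is a bridge and gets contracted.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
apply_rule_1(path, 0)

# Cycle 0-1-2-3-0 with catalyst 0: edge 1-2 is not a bridge and gets deleted.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
apply_rule_1(cycle, 0)
```

A full kernelization would call such a routine repeatedly, together with the other reduction rules, until none of them fires.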

The soundness of this reduction rule is not completely obvious, although not 
difficult. Having now partly clarified condition (2), we can continue the argument. 
The components of the subgraph induced by the outsiders must consist of paths 
having either 1,2 or 3 vertices. 

The first possibility leads to another reduction rule which eliminates pendant
vertices. This leads to a situation where the only possibilities for a component
C of the outsider graph are:

1. The component C consists of a single vertex and C has at least 2 leaf
neighbors in T.

2. The component C consists of two vertices, and C has at least 3 leaf neighbors
in T.

3. The component C has three vertices, and has at least four leaf neighbors in
T.

The weakest of the population ratios for our purposes in bounding the kernel 
size is given by case (3). We can conclude, using Claim 2, that the number of
outsiders is bounded by $3k/4$.

The next step is to study the tree T. Since it has k leaves it has at most k — 2 
branch vertices. Using conditions (5) and (6), it is not hard to see that: 

1. Any path in T between a leaf and its parental branch vertex has no subdi- 
visions. 

2. Any other path in T between branch vertices has at most 3 subdivisions 
(with respect to T). 

Consequently T has at most $5k$ vertices, unless there is a contradiction. Together
with the at most $3k/4$ outsiders, this yields our $g(k)$ of $5.75k$. We believe that
this bound can be improved by a more detailed structural analysis in this same framework.

2.2 Catalytic Reduction in Search Tree Branching 

The catalytic branching technique is described as follows. Let c = 5.75 for con- 
venience. Assume that any instance G for parameter k can be reduced in linear 
time to an instance G' of size at most ck. Suppose we are considering an instance 
(G, k) with catalytic vertex t. We can assume that G is connected. Consider a
neighbor u of t in G.

Catalytic Branching. We have the following basic branching procedure: $(G, k)$
with catalytic vertex t is a yes-instance if and only if one of the following two
branch instances is a yes-instance. The first branch is developed on the
assumption that u is also an internal vertex of a k-leaf spanning tree T (for which t is
internal). The second branch is developed on the assumption that u is a leaf for
such a tree T.

First Branch: Here we have $(G', k)$, where $G'$ is obtained from G by contracting
the edge between t and u. The resulting combined vertex is the catalytic vertex
for $G'$.

Second Branch: Here we begin with $(G', k-1)$, where $G'$ is obtained by deleting
u. But now, since the parameter has been decreased, we may re-kernelize so that
the resulting graph has size at most $c(k-1)$. Depending on the size of $G'$, the
size of the instance that we reduce to on this branch is somewhere between $n-1$
and $n-c-1$, when G has size n, in the worst case.

The key to the efficiency of this technique is in the re-kernelization on the 
second branch. Because the amount of re-kernelization varies, this leads to a 
somewhat complicated recurrence. Our bound on the running time is based on 
a simpler recurrence that provides an upper bound that is probably not
particularly tight.

We have thus described Phase 3 of our algorithm. We must still describe Phase 2. 

Introducing Catalytic Vertices. A simple way to accomplish this task is to
choose a set of $k+1$ vertices in G. If $(G, k)$ is a yes-instance for Max Leaf
Spanning Tree then one of these vertices can be assumed to be an internal
vertex of a solution k-leaf spanning tree. The $k+1$ branches must all be explored.

2.3 Analysis of the Running Time 

We define an abstract value $v(n, k)$ for a node $(G, k)$ in the search tree, where
G is a graph on n vertices. Choosing an appropriate abstract weighting w (by
computational experiment) for the parameter k, in order to capture some of the
information about the efficiency of catalytic branching, we define $v(n, k) = 8k + n$
(that is, $w = 8$ seems to work best for our current kernelization bound). Our
kernelization bound of $n \le 5.75k$ means that we require an upper bound on
the size of a search tree with root value $v(n, k)$ of at most $13.75k$. The catalytic
branching (the first branch lowers n by one, and the second lowers k by one and
n by at least one) gives the recurrence

$f(v) \le f(v-1) + f(v-9)$

which yields a positive real root of $\alpha = 1.2132$. Evaluating at $v = 13.75k$,
and noting that the nodes of the search tree require $O(k)$ time to process, we
immediately obtain a parameter function of $k(14.23)^k$ (for each of the $k+1$
search trees initiated in Phase 2 by the introduction of a catalytic vertex). By
the speedup technique of Niedermeier and Rossmanith [NR00a], we get a running
time of $O(n + (k+1)(14.23)^k)$ for our algorithm.
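
The constants in this analysis can be re-derived numerically. The sketch below is an illustration under our own naming (`branching_root` is a hypothetical helper): it finds the positive real root of the characteristic equation $x^{-1} + x^{-9} = 1$ of the recurrence by bisection and then raises it to the power 13.75.

```python
def branching_root(shifts, lo=1.0, hi=2.0, iterations=100):
    """Root x > 1 of sum(x**-d for d in shifts) == 1, found by bisection.

    For the recurrence f(v) <= f(v - 1) + f(v - 9) the shifts are (1, 9).
    The left-hand side is strictly decreasing in x, so bisection applies.
    """
    def excess(x):
        return sum(x ** -d for d in shifts) - 1.0

    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies at or to the left of mid
    return (lo + hi) / 2.0


alpha = branching_root((1, 9))
base = alpha ** 13.75   # search-tree size per unit of k at root value 13.75k
print(round(alpha, 4), round(base, 2))
```

The root agrees with $\alpha = 1.2132$ up to rounding, and $\alpha^{13.75}$ comes out close to the constant 14.23 quoted above. Small differences in the last digits depend on how $\alpha$ is rounded before exponentiation.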

3 Catalytic Branching as General FPT Technique 

The catalytic branching strategy can easily be adapted to a number of other 
FPT problems. The following are some sketches of further applications. (Note, 
however, that to make use of catalytic branching, it is first necessary to prove a 
kernelization procedure that respects the presence of a catalytic vertex.) 

Example 1 (Feedback Vertex Set for Undirected Graphs). The catalytic
vertex t is required not to be in the feedback vertex set. If the neighbor
u is also not in the feedback vertex set, then the edge tu can be contracted. If
u is in the feedback vertex set, then $(G', k')$ is obtained by deleting u, setting $k' = k - 1$ and
re-kernelizing.

Example 2 (Planar Dominating Set). The catalytic vertex t is required to
belong to the dominating set. If the neighbor u is not in the dominating set
then it can be deleted (first branch). On the second branch, the edge tu can be
contracted, and the resulting graph can be re-kernelized for $k - 1$.

Example 3 (Edge Dominating Set). (The FPT algorithms for Matrix Domination
are currently based on a reduction to this problem.) Very similar to
Example 2.

Example 4 (Nonblocker, also called enclaveless sets [HHS98]). The parametric
dual of Minimum Dominating Set.

Input: A graph $G = (V, E)$ where $|V| = n$ and a positive integer k.

Parameter: k

Question: Does G admit a dominating set of size at most $n-k$? Equivalently, is
there a set $N \subseteq V$ of size k with the property that for every element $x \in N$,
there is a neighbor y of x in $V - N$?

For this problem, a kernelization respecting a catalytic vertex is known. We
require that the catalytic vertex t be a member of $V - N$. On the first branch
of the search tree, we can contract tu if the neighbor u of t is also in $V - N$. On
the second branch, we delete u and re-kernelize for $k - 1$. The detailed algorithm
will be published elsewhere [FMRS00].
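
The two formulations of the Nonblocker question can be compared by brute force on small graphs. The following sketch is only a sanity check of the equivalence, not the kernelization or the search tree algorithm. The function names and the adjacency representation (a dictionary of neighbour sets) are assumptions made for this example.

```python
from itertools import combinations


def is_nonblocker(adj, candidate):
    """Every vertex of `candidate` has a neighbour outside `candidate`."""
    rest = set(adj) - set(candidate)
    return all(adj[x] & rest for x in candidate)


def has_nonblocker(adj, k):
    """Is there a set N of exactly k vertices that is a nonblocker?"""
    return any(is_nonblocker(adj, c) for c in combinations(adj, k))


def has_dominating_set_at_most(adj, size):
    """Is there a dominating set with at most `size` vertices?"""
    vertices = set(adj)
    for s in range(size + 1):
        for d in combinations(adj, s):
            dominated = set(d).union(*(adj[x] for x in d))
            if dominated == vertices:
                return True
    return False


# Path 0-1-2-3: the smallest dominating set has size 2, for example {1, 2}.
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
n = len(p4)
agreement = [has_nonblocker(p4, k) == has_dominating_set_at_most(p4, n - k)
             for k in range(n + 1)]
```

On this path both tests answer yes exactly for $k \le 2$. The equivalence holds in general because the complement of a nonblocker is a dominating set, and any dominating set can be padded with further vertices without losing the domination property.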

Catalytic branching is a general idea that might be applied in other settings 
besides graph problems — what it really amounts to is a search tree branching 
method based on retaining a small amount of partial information about potential 
solutions. 



4 Concluding Remarks 

How does one evaluate the goodness of an FPT algorithm? Since every problem
in FPT can be solved in time $f(k) + n^c$ where c is a fixed constant (usually
$c < 3$), and there are no hidden constants, we can measure the success of an FPT
algorithm by its klam value, defined to be the maximum k such that the parameter
function $f(k)$ for the algorithm (where $c < 3$) is bounded by some universal
limit U on the number of basic operations any computation in our practical
universe can perform. We will (perhaps too optimistically) take $U = 10^{20}$. This
might appear a bit strange at first, but parameterized complexity in many ways
represents a welding of engineering sensibilities (with the attendant sensitivity
to particular finite ranges of magnitudes), and mathematical complexity analysis.
Engineers have never been too keen on asymptotic analysis for practical
situations.

Max Leaf Spanning Tree was first observed to be in FPT nonconstructively
via the Robertson-Seymour graph minors machinery by Fellows and Langston
[FL88]. This approach had a klam value of zero! Bodlaender subsequently
gave a constructive FPT algorithm based on depth-first search methods with a
parameter function of around $(17k^4)!$, which has a klam value of 1 [Bod89]. This
was improved by Downey and Fellows [DF95a, DF98] to $f(k) = (2k)^{4k}$ which
has a klam value of 5. Our algorithm here has a klam value of 16, according to
our current analysis, which is probably not very tight.
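
The klam values can be recomputed directly from the definition. The sketch below is illustrative: it assumes the limit $U = 10^{20}$, parameter functions that are increasing in k, and the helper name `klam` is ours. The three parameter functions are the Vertex Cover bound from Section 1.2, the Downey and Fellows bound, and the bound for the algorithm of Section 2.

```python
U = 10 ** 20  # ASSUMPTION: universal limit on the number of basic operations


def klam(f, limit=U):
    """Largest k with f(k) <= limit, assuming f is increasing in k."""
    k = 0
    while f(k + 1) <= limit:
        k += 1
    return k


vertex_cover = klam(lambda k: 1.27 ** k)             # O((1.27)^k + n)
downey_fellows = klam(lambda k: (2 * k) ** (4 * k))  # (2k)^(4k)
catalytic = klam(lambda k: k * 14.23 ** k)           # k (14.23)^k per search tree
print(vertex_cover, downey_fellows, catalytic)
```

Under these assumptions the computation gives 192 for the Vertex Cover bound, 5 for the Downey and Fellows bound, and 16 for the catalytic branching bound. Replacing the factor k by k + 1 in the last function does not change the result.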

At this point in time, there are many examples of trajectories of this sort 
in the design of FPT algorithms. Vertex Cover is another classic example of 
such a trajectory of (eventually) striking improvements (see [DF98]). What these 
algorithm design trajectories really show is that we are still discovering the basic 
elements, tricks and habits of mind required to devise efficient FPT algorithms. 
It is a new game and it is a rich game. After many rounds of improvements the 
best known algorithm for Vertex Cover runs in time $O((1.27)^k + n)$ [CKJ99]
and has a klam value of 192. Will Max Leaf Spanning Tree admit klam 
values of more than 50? How much more improvement is possible? Can any 
plausible mathematical limits to such improvements be established?

Abstract. We present a new scheme for storing a planar graph in external
memory so that any online path can be traversed in an I-O efficient
way. Our storage scheme significantly improves the previous results for
planar graphs with bounded face size. We also prove an upper bound on the
I-O efficiency of any storage scheme for well-shaped triangulated meshes.
For these meshes, our storage scheme achieves optimal performance.



1 Introduction 

There are many search problems in computer science which require efficient
online traversal of an undirected graph, e.g. robot motion planning and
searching in constraint networks. Some important problems in computational
geometry are also reducible to efficient online traversal of a graph, for
example the ray shooting problem in a simple polygon [4] and reporting the
intersections of a line segment with a triangulated mesh. Since most
applications of these problems are of very large scale, it is important to
ensure I-O efficient traversal in undirected graphs. Graph blocking
corresponds to storing a graph in external memory so that the number of I-O
operations (i.e. block transfers) required to perform an arbitrary online
walk is minimized. The efficiency of a blocking scheme is measured by the
speed-up $\sigma$, which is the worst-case average number of steps traversed
between two I-O operations; the larger the value of $\sigma$, the more
efficient the blocking scheme. We address the problem of blocking
of planar undirected graphs.

We assume that the graph is of bounded degree and that a vertex is allowed to
be present in more than one block. These assumptions are valid for most
applications of the graph-blocking problem. Let us assume that a block can hold
$B$ vertices and the internal memory can hold $M$ vertices. The parameters $B$
and $M$ are related to the block size and the internal-memory size (by a
constant factor). At any stage of the online walk, the next node to be visited
can be any neighbor of the most recently visited node. Therefore, it is natural
to store with every node its adjacency list. In case a neighbor $w$ of a node
$v$ is not present in the same block as $v$, we must also store the block
address of $w$ in the adjacency list of $v$. A simple observation is as follows.
At any time while traversing a graph, let $v$ be the most recently visited node.
If the nearest node which is not present in internal memory lies at distance $k$
from $v$, then a block transfer can be forced in the next $k$ steps. Based on
this observation, here is a naive blocking scheme: let $B_v$ be the set of nodes
lying in a breadth-first-search tree (BFS tree) of size $B$ rooted at node $v$.
For every node $v$ of the graph, store $B_v$ in a block on disk, and whenever
the walk extends to $v$ and $v$ is not present in internal memory, bring in the
block storing $B_v$. This blocking scheme ensures a speed-up of $r^-(B)$, which
denotes the minimum depth of a BFS tree of size $B$ in the given graph. However,
this speed-up is achieved at the expense of a $B$-fold blow-up in the storage
requirement. For a blocking scheme to be practical, such a large increase in
storage is not acceptable. A blocking scheme is said to be space optimal if the
number of blocks it requires to store a graph is at most a constant multiple of
the minimum number of blocks required to store the graph. Like previous
approaches [1,3] to graph blocking, we do not take into account the cost of
preprocessing.
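The quantities used throughout the paper, the minimum and maximum depth of a BFS tree with B nodes, can be computed directly on small graphs. The following is a minimal sketch under stated assumptions: the graph is an adjacency dictionary, the "depth" of a truncated BFS tree is taken to be the BFS level of the last node admitted, and the function names are illustrative rather than taken from the paper.

```python
from collections import deque


def bfs_neighborhood(adj, v, size):
    """Return (nodes, depth): the first `size` nodes reached by BFS from v,
    and the BFS level of the last node included."""
    seen = {v}
    order = [(v, 0)]
    queue = deque([(v, 0)])
    while queue and len(order) < size:
        x, d = queue.popleft()
        for y in adj[x]:
            if y not in seen and len(order) < size:
                seen.add(y)
                order.append((y, d + 1))
                queue.append((y, d + 1))
    return [x for x, _ in order], order[-1][1]


def r_minus(adj, size):
    """Minimum, over all roots, of the depth of a BFS tree with `size` nodes."""
    return min(bfs_neighborhood(adj, v, size)[1] for v in adj)


def r_plus(adj, size):
    """Maximum, over all roots, of the depth of a BFS tree with `size` nodes."""
    return max(bfs_neighborhood(adj, v, size)[1] for v in adj)


# Hypothetical example: a path on 10 nodes with block size B = 5.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
print(r_minus(path, 5), r_plus(path, 5))  # prints: 2 4
```

On the path, a root in the middle fills its block after two levels, while an endpoint needs four, which is why the two values differ.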

Goodrich et al. [3] gave space optimal blocking schemes for grid graphs and
complete d-ary trees. They also gave a nontrivial upper bound of $r^+(B)$ on the
speed-up that can be achieved in a graph, where $r^+(B)$ denotes the maximum
depth of a BFS tree of size $B$ in the given graph. They also gave a space optimal
blocking scheme for the family of graphs with bounded . They did not give 
a blocking scheme for general graphs, although they conjectured a scheme which
ensures a speed-up of $r^-(B)$ with optimal storage. Agarwal et al. [1] gave a
space optimal blocking scheme for planar graphs that achieves a speed-up of
$r^-(\sqrt{B})$. For the family of planar graphs, $B \ge r^-(B) \ge \log_d B$,
where $d$ is the maximum degree of a node in the graph. At the smaller extreme
(i.e. $r^-(B) \approx \log_d B$) the speed-up achieved by the blocking scheme of
Agarwal et al. [1] is close to $r^-(B)$; more precisely,
$\sigma = \frac{1}{2} r^-(B)$ in this case. But the speed-up falls steadily
behind $r^-(B)$ as we move away from the smaller extreme of $r^-(B)$. As a case
in point, note that for a planar graph with $r^-(k) = k^{\alpha}$ (where
$\alpha$ is some positive fraction), the speed-up achieved is just a
$B^{-\alpha/2}$ fraction of $r^-(B)$. Therefore, in the case of planar graphs
with $r^-(B) = \sqrt{B}$, the speed-up achieved is $B^{1/4}$, and the gap
between $r^-(B)$ and the speed-up achieved widens even further for planar
graphs with larger values of $r^-(B)$.
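The size of this gap is easy to tabulate. The sketch below assumes the idealized growth law r^-(k) = k^alpha and compares the speed-up r^-(sqrt(B)) of the earlier scheme with r^-(B); the function name and the sample values of B and alpha are illustrative assumptions.

```python
import math


def speedup_fraction(B, alpha):
    """Ratio r^-(sqrt(B)) / r^-(B) under the assumption r^-(k) = k**alpha."""
    def r(k):
        return k ** alpha
    return r(math.sqrt(B)) / r(B)


B = 10_000
for alpha in (0.25, 0.5, 0.75):
    # The second column should agree with B ** (-alpha / 2).
    print(alpha, speedup_fraction(B, alpha), B ** (-alpha / 2))
```

For alpha = 1/2 and B = 10,000 the earlier scheme achieves a speed-up of 10 = B^{1/4}, one tenth of r^-(B) = 100.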

We present an efficient blocking scheme for general planar graphs with improved
speed-up over that of Agarwal et al. [1]. Our blocking scheme guarantees
a speed-up of $r^-(s)$, where $s = \min(r^-(s)\sqrt{B/c},\, B)$ and $c$ is the
maximum face size in the graph. It can be seen that for the family of planar
graphs having small face size and $r^-(B) \ge \sqrt{B}$, the speed-up achieved
by our blocking scheme is $\Omega(r^-(B))$. Whereas the speed-up achieved by the
blocking scheme of Agarwal et al. [1] deviates further from $r^-(B)$ as $r^-(B)$
increases, the speed-up achieved by our blocking scheme approaches $r^-(B)$ as
$r^-(B)$ increases, and matches $r^-(B)$ from the point $r^-(B) = \sqrt{B}$
onwards. A large number of applications employ planar graphs with small face
size, such as trees and geometric graphs (grids and meshes). For such planar
graphs our blocking scheme outperforms the earlier blocking scheme. We also
observe that the blocking scheme for undirected planar graphs can be used to
achieve good speed-up even in the case of directed graphs.

We also prove a bound on the best speed-up achievable in a planar mesh in
terms of the degree of local uniformity of the mesh and the block size. Most
meshes in practical applications possess a good degree of local uniformity;
intuitively speaking, these meshes are well-shaped. We prove that the best
worst-case speed-up achievable in a planar mesh is $O(\sqrt{B})$, where the
constant of proportionality depends upon the degree of well-shapedness of the
mesh. We use our blocking scheme for planar graphs to achieve a speed-up in
planar meshes that matches this bound.

2 Efficient Blocking of Planar Graphs 

In this section, we devise an efficient blocking scheme for planar graphs that
achieves improved speed-up over the earlier blocking scheme given by Agarwal et
al. [1]. First we recall their terminology: the set of nodes lying in a
BFS tree of size $k$ rooted at a node $v$ is called the k-neighborhood of $v$. We
extend the following idea of partitioning a planar graph employed in their
blocking scheme. Consider a planar graph of size $N$ partitioned into $O(N/B)$
regions, with each region containing at most $B$ nodes and surrounded by
boundary nodes such that every path going from a node in one region to a node in
another region passes through one or more of these boundary nodes. By storing
the B-neighborhood of every boundary node in a block and storing the nodes of
each region together in a block, the following block-transfer strategy (referred
to as $\tau$ henceforth) ensures a speed-up of $\sigma = \Omega(r^-(B))$ for any
online traversal.

Strategy $\tau$: Whenever the walk extends to a node $v$ not present in internal
memory, if $v$ is a boundary node, read the block storing the B-neighborhood of
$v$ from the disk; otherwise read the block corresponding to the region in
which $v$ lies.

It is clear that for every two blocks that we load from disk, we traverse at
least $r^-(B)$ nodes of the path, and thus the speed-up achieved is
$\Omega(r^-(B))$. To meet the space optimality constraint, we have to make sure
that the space used for storing the B-neighborhoods around the boundary nodes is
$O(N/B)$ blocks.

Agarwal et al. [1] gave an efficient blocking scheme along the above lines. They
used a technique developed by Frederickson [2] for partitioning planar graphs.
Based on the separator theorem of Lipton and Tarjan [5], Frederickson gave an
algorithm for partitioning a planar graph into $O(N/B)$ regions, with each region
having at most $B$ nodes and the total number of boundary nodes being
$O(N/\sqrt{B})$. Storing each of the $O(N/B)$ regions in a block, and storing
the $\sqrt{B}$-neighborhood around every boundary node, it can be seen that
using block-transfer strategy $\tau$, the speed-up achieved is
$\Omega(r^-(\sqrt{B}))$ and the storage space required is optimal.

For the class of planar graphs with bounded degree, $r^-(B)$ can be as small
as $\log_d B$ at one extreme (where $d$ is the maximum degree of a node in the
graph) and as large as $B$ at the other. For planar graphs with
$r^-(B) \approx \log_d B$, the speed-up achieved using the above blocking scheme
is $\frac{1}{2}\log_d B$, which is indeed close to $r^-(B)$. Though the speed-up
achieved is close to $r^-(B)$ for small values of $r^-(B)$, it degrades
drastically as $r^-(B)$ increases. To appreciate this point, note that for
planar graphs having $r^-(k) = k^{\alpha}$ (where $\alpha$ is some constant
$< 1$), the speed-up is just a $B^{-\alpha/2}$ fraction of $r^-(B)$.

We devise a refinement of the above blocking scheme to achieve a better
speed-up. Note that the speed-up achieved by the above scheme is
$r^-(\sqrt{B})$ because we stored a $\sqrt{B}$-neighborhood around every
boundary node in the partition. To improve the speed-up, we should store a
neighborhood of size greater than $\sqrt{B}$ around every boundary node. But the
number of boundary nodes being $O(N/\sqrt{B})$, any attempt to increase the size
of the neighborhood around a boundary node beyond $\sqrt{B}$ will lead to
nonlinear space (an undesirable situation). Here we make a useful observation:
we need not store separate $\sqrt{B}$-neighborhoods for boundary nodes which are
closely placed. For example, let $v$ be a boundary node, and
$v_1, v_2, \ldots, v_j$ be other boundary nodes which lie within distance
$r^-(\sqrt{B})/2$ from $v$. Starting from any of these boundary nodes, we must
traverse at least $r^-(\sqrt{B})/2$ steps to cross the $\sqrt{B}$-neighborhood
of $v$ (this follows from the triangle inequality). Therefore, instead of
storing a $\sqrt{B}$-neighborhood around every node
$v_i \in \{v_1, \ldots, v_j\}$, we can store the $\sqrt{B}$-neighborhood around
$v$ only (as a common neighborhood for $v_1, \ldots, v_j$). In doing so, the
speed-up is reduced at most by half, but we store fewer neighborhoods. This
reduction in the total number of neighborhoods allows us to increase the size
of each neighborhood (while still maintaining the linear space constraint). For
this idea to be useful, the partitioning scheme must ensure that the separator
nodes are contiguous. The separator computed using the Lipton-Tarjan separator
theorem does not guarantee a separator with sufficiently clustered nodes. The
planar-separator theorem given by Miller [6] shows the existence of a node or a
cycle as a balanced separator for a planar graph. When the separator is a cycle,
we can form clusters of the separator nodes by breaking the cycle into an
appropriate number of equally long chains. The set of nodes belonging to a chain
defines a cluster. We finally store just one neighborhood per cluster. Having
given this basic idea, we now describe the new blocking scheme and calculate the
speed-up that can be achieved in various planar graphs based on it. First we
state the planar-separator theorem given by Miller [6].

Theorem 1 (Miller). If G is an embedded planar graph consisting of N nodes,
then there exists a balanced separator which is a vertex or a simple cycle of
size at most $2\sqrt{2\lfloor c/2 \rfloor N}$, where c is the maximum face size.
Such a separator is constructible in linear sequential time.

Based on the above separator theorem, we present a new blocking scheme 
which gives improved speed-up for planar graphs with bounded face size. 

Let $G(V, E)$ be the given planar graph with maximum face size equal to $c$,
and let $N$ be the number of nodes of the graph. If $N \le B$ we store $G$ in a
block on disk; otherwise we proceed as follows. We compute a separator using
Miller's theorem given above. If the separator is a vertex $v$, we store the
B-neighborhood around $v$. Otherwise (the separator is a cycle $C$ of size
$\le 2\sqrt{cN}$), let $s$ be a number in the range $(\sqrt{B}, B)$ (depending
on the underlying graph) that will be specified later. Pick every
$(r^-(s)/2)$-th node of $C$ to form a set $S$. For every node $v \in S$, store
the s-neighborhood of $v$ in a block. Associate every node $w$ of the separator
$C$ with the block which contains the s-neighborhood of the node $v \in S$
nearest to $w$ (let us denote the block associated with a boundary node $w$ by
$B_w$). Whenever the path extends to a boundary node $w$, and $w$ is not present
in internal memory, we bring the block $B_w$ into internal memory. Now let the
separator $C$ partition $V$ into two subsets $P_1$ and $P_2$, each of size at
most $\frac{2}{3}N$. We recursively carry out the blocking of the subgraphs
induced by $P_1$ and $P_2$.
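The clustering step on a cycle separator can be sketched as follows. This is a minimal illustration, assuming the separator is given as a list of nodes in cyclic order and that `gap` stands for the sampling interval along the cycle; both the function name and the parameter are hypothetical and not part of the paper.

```python
def cluster_cycle(cycle, gap):
    """Pick every `gap`-th node of a cycle as a cluster center and map each
    cycle node to its nearest center (distance measured along the cycle)."""
    n = len(cycle)
    center_positions = list(range(0, n, gap))

    def cyclic_distance(i, j):
        return min((i - j) % n, (j - i) % n)

    assignment = {}
    for i, w in enumerate(cycle):
        nearest = min(center_positions, key=lambda c: cyclic_distance(i, c))
        assignment[w] = cycle[nearest]
    return [cycle[c] for c in center_positions], assignment


# Hypothetical separator cycle with 10 nodes, sampled at every 3rd node.
centers, assignment = cluster_cycle(list("abcdefghij"), 3)
print(centers)
print(assignment)
```

Only one neighborhood is stored per center, so the number of stored neighborhoods drops by roughly a factor of `gap`, while every separator node lies within `gap // 2` cycle steps of its center.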



It can be verified that using block-transfer strategy $\tau$, the speed-up
achieved using the above blocking scheme is $\Omega(r^-(s))$. So a larger value
of $s$ will result in a larger speed-up. For a given $s$, the total space used
for blocking according to the above scheme can be expressed by the following
recurrence:

$$S(N) = S(N_1) + S(N_2) + O\!\left(\frac{s\sqrt{cN}}{r^-(s)}\right), \qquad \text{where } N_1, N_2 \le \frac{2N}{3},$$

the solution of which is

$$S(N) = c_1 N + c_2 \frac{s\sqrt{c}\,N}{\sqrt{B}\, r^-(s)},$$

where $c_1, c_2$ are constants independent of $s$.

To maximize $s$ while keeping the linear space constraint ($S(N) = O(N)$), we
choose $s = \min(r^-(s)\sqrt{B/c},\, B)$; i.e. $s$ is the largest number
$k \le B$ such that $k \le r^-(k)\sqrt{B/c}$.

Theorem 2. A planar graph of size N and maximum face size c can be stored in
$O(N/B)$ blocks so that any online path of length t can be traversed using
$O(t / r^-(s))$ I-O operations, where $s = \min(r^-(s)\sqrt{B/c},\, B)$.
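The choice of the neighborhood size can be explored numerically. The sketch below assumes the selection rule reads "s is the largest k <= B with k <= r^-(k) * sqrt(B/c)" and takes r^- as a user-supplied function; the name `choose_s` and the sample growth laws are illustrative assumptions.

```python
import math


def choose_s(r_minus, B, c):
    """Largest k <= B with k <= r_minus(k) * sqrt(B / c); returns 1 if no
    larger k qualifies."""
    bound = math.sqrt(B / c)
    for k in range(B, 1, -1):
        if k <= r_minus(k) * bound:
            return k
    return 1


B, c = 900, 4
print(choose_s(math.sqrt, B, c))            # r^-(k) = sqrt(k)
print(choose_s(lambda k: k, B, c))          # r^-(k) = k (path-like graphs)
print(choose_s(lambda k: k ** 0.25, B, c))  # r^-(k) = k^(1/4)
```

With r^-(k) = sqrt(k) the rule yields s = B/c = 225, and with r^-(k) = k it yields s = B. For r^-(k) = k^(1/4) it yields 36, the integer part of (B/c)^(2/3).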



The new blocking scheme improves the speed-up for planar graphs with
bounded face size. The important point is that the improvement achieved is most
significant for graphs with $r^-(k) = k^{\alpha}$ (the graphs for which the
previous blocking scheme fails).

Remark 1. For planar graphs with $r^-(B) = \Omega(\sqrt{B})$ and maximum face
size $c$, the value of $s$ is equal to $B/c$ and we get a speed-up equal to
$r^-(B/c)$, which is a significant improvement for planar graphs with small $c$
over the previous speed-up of $r^-(\sqrt{B})$ achieved by the blocking scheme of
Agarwal et al. [1].

Remark 2. For a tree we get a speed-up which is equal to $r^-(B)$.

Remark 3. For planar graphs with $r^-(k) = k^{\alpha}$, for some constant
$\alpha < \frac{1}{2}$, the value of $s$ is $(B/c)^{\frac{1}{2(1-\alpha)}}$.
Thus the speed-up achieved by the new blocking scheme is
$(B/c)^{\frac{\alpha}{2(1-\alpha)}}$, which is a significant improvement for
planar graphs with small $c$ (maximum face size) over the previous speed-up of
$B^{\alpha/2}$ achieved by the blocking scheme of Agarwal
et al. [1].

A blocking scheme for undirected graphs can be employed for directed graphs
in the following way. Let $G_d$ be the given directed planar graph and $G_u$ be
the undirected graph constructed by ignoring the direction of edges in $G_d$.
Let $Adj_d(x)$ and $Adj_u(x)$ be the adjacency lists of a node $x$ in the graphs
$G_d$ and $G_u$ respectively. Based on the given blocking scheme for undirected
graphs, let $p$ be the storage description of $G_u$ in blocks of external
memory, where each block stores the adjacency lists of some $B$ nodes of $G_u$.
For every node $x$ lying in a block $b$, replace $Adj_u(x)$ by $Adj_d(x)$, and
carry out this step for all the blocks storing $G_u$. This gives a storage
description $p'$ for $G_d$. Since $Adj_d(x) \subseteq Adj_u(x)$, all the nodes
that earlier belonged to a block still fit in that block (thus preserving the
locality defined by $p$). Also note that a path in $G_d$ exists in $G_u$ as
well. Therefore, if $p$ ensures that $k$ is the worst-case average number of
steps of a walk performed on $G_u$ between two block transfers, $p'$ will ensure
that the worst-case average number of steps of a walk in $G_d$ between two block
transfers is at least $k$. Thus the speed-up achieved in $G_d$ by the new
(adapted) blocking scheme is at least as much as the speed-up achieved in $G_u$
by the given blocking scheme for undirected graphs. We can combine this
observation with our blocking scheme for planar undirected graphs to state the
following theorem.

Theorem 3. A planar directed graph G of size N and maximum face size c can
be stored in $O(N/B)$ blocks so that any online path of length t can be
traversed using $O(t / r^-(s))$ I-O operations, where
$s = \min(r^-(s)\sqrt{B/c},\, B)$ and $r^-(s)$ is the
minimum depth of a BFS-tree of size s in the undirected graph formed by ignoring
the direction of edges in G.

The above blocking scheme is useful for I-O efficient traversal in a planar
directed graph, especially when the underlying undirected graph has a
significantly large value of $r^-(B)$.
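The adaptation from undirected to directed graphs amounts to swapping adjacency lists inside each block. A minimal sketch follows, assuming a block is represented as a dictionary from node to adjacency list; this representation and the function name are illustrative assumptions.

```python
def adapt_blocks_to_directed(blocks_u, adj_d):
    """blocks_u: list of blocks, each a dict node -> undirected adjacency list.
    adj_d: dict node -> list of out-neighbors in the directed graph.
    Returns blocks with the same node placement but directed adjacency lists."""
    blocks_d = []
    for block in blocks_u:
        new_block = {}
        for x, neighbors_u in block.items():
            out_neighbors = adj_d.get(x, [])
            # Adj_d(x) is a subset of Adj_u(x), so the block cannot grow.
            assert set(out_neighbors) <= set(neighbors_u)
            new_block[x] = list(out_neighbors)
        blocks_d.append(new_block)
    return blocks_d


# Hypothetical example: the directed path 0 -> 1 -> 2 -> 3 stored in two blocks.
blocks_u = [{0: [1], 1: [0, 2]}, {2: [1, 3], 3: [2]}]
adj_d = {0: [1], 1: [2], 2: [3], 3: []}
print(adapt_blocks_to_directed(blocks_u, adj_d))
```

Node placement is untouched, so any walk that is legal in the directed graph sees exactly the block boundaries it would have seen in the undirected one.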

3 Blocking of Planar Mesh 

In various problems of scientific computing, graphs are often defined
geometrically; for example, grid graphs and graphs in VLSI technology. In
addition to combinatorial structure, these graphs also have geometric structure
associated with them. One such family of graphs is the mesh. A mesh in
d-dimensional space is a subdivision of a d-dimensional domain into simplices
which meet only at shared faces; e.g. a mesh in 2 dimensions is a triangulation
of a planar region where triangles intersect only at shared edges and vertices.

Unlike a grid graph, where edges are of the same length throughout and the
positioning of vertices has a high degree of symmetry, a mesh need not be
uniform and symmetric. We define two parameters $\alpha, \gamma$ associated with
a planar mesh which (intuitively speaking) measure its well-shapedness. We
address the blocking of planar meshes and prove two results. First, we show that
the maximum worst-case speed-up achievable in a planar mesh is $O(\sqrt{B})$,
where the constant of proportionality depends upon the parameters
$\alpha, \gamma$. Next, we use the blocking scheme for planar graphs described
in the previous section to achieve a speed-up of $\Omega(\sqrt{B})$, where the
constant of proportionality, which depends upon $\alpha, \gamma$, becomes
smaller as the well-shapedness of the mesh reduces. Thus for meshes having a
good degree of well-shapedness, the speed-up achieved by the blocking scheme
matches the best possible.

3.1 Well-Shaped Planar Meshes

A planar mesh is a triangulation of a region in 2 dimensions where the triangles
meet only at shared edges and vertices. For simplicity, we assume that a planar
mesh extends infinitely in all directions (e.g. a mesh embedded on a torus or a
sphere). A planar mesh need not possess perfect uniformity and symmetry like a
grid graph. There may be variation in edge lengths and density of vertices as we
move from one region of the mesh to another. But as observed in most practical
applications, there is a certain degree of local uniformity present in a mesh,
i.e. in a neighborhood around a vertex there is not too much variation in edge
lengths and vertex density, though the variation may be unbounded for the whole
mesh. Visualizing the mesh as a triangulation, this local uniformity can be
viewed in the following way: the triangles constituting the mesh are fat and the
variation of size (area) of these triangles is bounded in a finite neighborhood.
This local uniformity captures formally the notion of well-shapedness of a
planar mesh. We now define parameters to measure the local uniformity of a
planar mesh. We parameterize the fatness of triangles by the smallest angle
$\alpha$ of a triangle in the planar mesh. We parameterize the variation in
sizes of triangles within a B-neighborhood in the following way: let $u$ be a
node and $B_u$ be the set of nodes of a BFS tree of size $B$ rooted at $u$. Let
$\Delta_u$ be the set of triangles with at least one vertex belonging to $B_u$.
Then $\gamma$ is defined as the ratio of the area of the largest-area triangle
to the area of the smallest-area triangle belonging to the set $\Delta_u$. The
parameters $(\alpha, \gamma)$ thus defined measure the local uniformity of a
planar mesh.

The area of any triangle in the set $\Delta_u$ will lie in the range
$[A, \gamma A]$ for some $A$. Using elementary geometry it can be shown that the
length $l$ of any edge in the subgraph induced by $B_u$ has the following
bounds:

$$e_{min} = 2\sqrt{A \tan\frac{\alpha}{2}} \;\le\; l \;\le\; 2\sqrt{\gamma A \cot\alpha} = e_{max}$$
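The edge-length bounds can be sanity-checked numerically. The sketch below reads them as e_min = 2 sqrt(A tan(alpha/2)) and e_max = 2 sqrt(gamma A cot(alpha)); this reading is a reconstruction and should be treated as an assumption, as should the function name. A mesh of unit equilateral triangles (alpha = 60 degrees, gamma = 1) is used because both bounds should then collapse to the common edge length 1.

```python
import math


def edge_length_bounds(A, alpha, gamma):
    """Return (e_min, e_max) for triangle areas in [A, gamma * A] and smallest
    angle alpha (in radians), using the assumed form of the bounds."""
    e_min = 2 * math.sqrt(A * math.tan(alpha / 2))
    e_max = 2 * math.sqrt(gamma * A / math.tan(alpha))
    return e_min, e_max


# Unit equilateral triangles: area sqrt(3)/4, smallest angle pi/3, gamma = 1.
A = math.sqrt(3) / 4
e_min, e_max = edge_length_bounds(A, math.pi / 3, 1.0)
print(e_min, e_max)
```

Increasing gamma or decreasing alpha widens the interval, which matches the intuition that a less uniform mesh admits a wider range of edge lengths.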



Lemma 4. $r^-(B)$ of a planar mesh with parameters $\alpha, \gamma$ is
$\Omega\left(\sqrt{\frac{\tan\alpha}{\gamma}}\,\sqrt{B}\right)$.

Proof. Consider an arbitrary node $u$ of the planar mesh and let $l$ be the
depth of the BFS tree of size $B$ rooted at $u$. Let $w$ be a node of $B_u$ at
maximum Euclidean distance $d_{max}$ from $u$. Consider a circle $C$ centered at
$u$ with radius $d_{max}$. The number of nodes lying in the circle is at least
$B$ since $B_u$ lies inside it. Thus the number of triangles lying inside the
circle is $\Omega(B)$. Note that the maximum number of triangles lying inside a
circle of radius $d_{max}$ is at most $\pi d_{max}^2 / A$. Hence
$\pi d_{max}^2 / A = \Omega(B)$, i.e., $d_{max} = \Omega(\sqrt{AB})$. Since
$d_{max}$ is bounded by $l \, e_{max}$, we get $l = \Omega(\sqrt{AB}/e_{max})$.
Using the bound on $e_{max}$ and the definition of $r^-(B)$, it follows that
$r^-(B) = \Omega\left(\sqrt{\frac{\tan\alpha}{\gamma}}\,\sqrt{B}\right)$.

For a planar mesh, the following theorem gives a lower bound on the Euclidean
distance between two nodes in terms of the length of the shortest path
separating them.

Theorem 5. Let v be a node belonging to $B_u$ in a planar mesh. If $p_{uv}$ is
the shortest path-length from u to v, the Euclidean distance $d_{uv}$ between u
and v is $\Omega(p_{uv}\sqrt{A})$, where the constant of proportionality depends
upon the well-shapedness parameters $(\alpha, \gamma)$ of the mesh.

Proof. Let $w$ be a boundary node of the neighborhood $B_u$ which is at the
closest Euclidean distance from $u$, and let $\mathcal{N}$ be the set of nodes
lying within distance $d_{uw}$ from $u$. It can be seen that
$\mathcal{N} \subseteq B_u$. To prove the theorem it suffices to show that for a
node $v \in \mathcal{N}$ separated from $u$ by a shortest path of length $l$,
the Euclidean distance $d_{uv}$ is $\Omega(l\sqrt{A})$. We proceed as follows.
Let $v$ be a node belonging to $\mathcal{N}$ and $S$ be the line segment joining
$u$ and $v$. We build a path $Z_{uv}$ as we move along $S$ from $u$. Let $p$
denote the vertex most recently added to the path we are building (initially
$p = u$). We add edges to our path maintaining the following invariant:

I: $p$ lies on the edge most recently intersected by $S$, and every edge
forming the path has some point in common with $S$.

Fig. 1. Zig-zag path between nodes u and v of a neighborhood in a planar mesh

While moving along $S$ building the path, let $e$ be an edge intersected by $S$.
If $e$ contains $p$, we keep $p$ unchanged. Otherwise ($e$ does not contain
$p$), there is an end point $q$ of the edge $e$ adjacent to $p$ such that $pq$
is a valid edge to add to our path while maintaining the invariant I. So we
extend our path by adding the edge $pq$ to it, and $p$ is updated to $q$. We
continue this process until we reach $v$. In the special case when $S$ passes
through a node, say $x$, we update $p$ to $x$. We shall now bound the length of
this zig-zag shaped path $Z_{uv}$. The segment $S$ intersects triangles of
$\Delta_u$ only, and so the ratio of areas of intersected triangles is bounded
by $\gamma$. Path $Z_{uv}$ divides the segment $S$ into subsegments whose total
number is equal to the number of nodes lying on path $Z_{uv}$ excluding $u$ and
$v$. Consider any three consecutive edges of $Z_{uv}$. There will be one (or two
adjacent) subsegment(s) of $S$ intercepted between these three edges. Because of
the constraints imposed by bounded $\alpha$ and $\gamma$, the length of the
intercepted subsegment (or the sum of the lengths of the two intercepted
subsegments) is at least $e_{min} \sin\alpha$. Hence the number of subsegments
into which the segment $S$ is divided by the path $Z_{uv}$ (and so the length of
the path $Z_{uv}$) is at most $\frac{d_{uv}}{e_{min}\sin\alpha}$.
Using the bound on $e_{min}$ given before, it follows that the length of the
path $Z_{uv}$ is at most $\frac{d_{uv}}{C_\alpha \sqrt{A}}$, where
$C_\alpha = 2\sqrt{\tan\frac{\alpha}{2}}\,\sin\alpha$.

The length of the shortest path between $u$ and $v$ is less than or equal to the
length of the path $Z_{uv}$. Thus

$$p_{uv} \le \frac{d_{uv}}{C_\alpha \sqrt{A}} \qquad (1)$$

In other words, if $v$ is a node belonging to $\mathcal{N}$ separated from $u$
by Euclidean distance $d_{uv}$, and $p_{uv}$ is the length of the shortest path
between $u$ and $v$, then

$$d_{uv} \ge C_\alpha\, p_{uv} \sqrt{A} \qquad (2)$$



We showed in Lemma 4 that the shortest path-length from $u$ to a boundary node
of $B_u$ is
$r^-(B) = \Omega\left(\sqrt{\frac{\tan\alpha}{\gamma}}\,\sqrt{B}\right)$.
By substituting this value of $r^-(B)$ for $p_{uv}$ in
equation 2, we get the following Corollary:

Corollary 6. The minimum Euclidean distance $\rho$ between $u$ and a boundary
node of $B_u$ in a planar mesh is
$\Omega\left(C_\alpha \sqrt{\frac{\tan\alpha}{\gamma}}\,\sqrt{B}\sqrt{A}\right)$.

We now state the following Lemma, which gives an upper bound on $r^+(B)$:

Lemma 7. $r^+(B)$ for a planar mesh with parameters $\alpha, \gamma$ is
$O(\sqrt{B})$, where the constant of proportionality depends upon $\alpha$ and
$\gamma$.

The above Lemma is based on equation 2. The arguments used to prove the Lemma
are similar to those used in Lemma 4 and thus the details are omitted.



3.2 Upper Bound on Speed-Up in a Planar Mesh 

Goodrich et al. [3] proved an upper bound of $r^+(B)$ on the best worst-case
speed-up achievable in a graph. We showed in the previous subsection (Lemma 7)
that $r^+(B)$ for a planar mesh is $O(\sqrt{B})$. We can now state the following
theorem:

Theorem 8. For a planar mesh, the best worst-case speed-up that can be
achieved is $\sigma = O(C_{\alpha\gamma}\sqrt{B})$, where $C_{\alpha\gamma}$ is a
constant depending upon the parameters $(\alpha, \gamma)$ which capture the
well-shapedness of the mesh.

For the sake of completeness, we give an alternate proof of the upper bound on
the speed-up in a planar mesh. Consider a planar mesh in the x-y plane. We
present a traversal strategy which ensures one block transfer on average for
every $O(\sqrt{B})$ steps traversed, irrespective of the underlying blocking
scheme. We introduce some terminology here: a node is said to be covered if it
has been in internal memory at least once. Initially, before starting the
traversal, no node is present in internal memory, and so all the nodes are
uncovered. Let $u$ be the most recently visited node (the path-front). If there
is an uncovered node separated by a path of length at most $\sqrt{B}$ from $u$,
we can extend our path to that uncovered node (and thus force a block transfer
within $\sqrt{B}$ steps). But what if all the nodes separated by paths of length
at most $\sqrt{B}$ from $u$ are covered? Note that at least one block transfer
is required to cover a set of $B$ uncovered nodes. So in case there is no
uncovered node within path length $\sqrt{B}$ of the path-front, we move the next
$\sqrt{B}$ steps in such a way that we can associate $\Omega(B)$ distinct
covered nodes with these steps. This still implies that there is a block
transfer after every $O(\sqrt{B})$ steps on average. This is the basic idea
underlying the traversal strategy.

For a node $u$ of the mesh, $Cell_u$ denotes a square with base parallel to the
x-axis and with $u$ lying on its left vertical side. The length of each of its
four sides is chosen to be $\rho/2$, where $\rho$ is the minimum Euclidean
distance between $u$ and a boundary node of $B_u$. It follows from Corollary 6
that the number of nodes lying in $Cell_u$ is more than $c_0 B$, and every node
of $Cell_u$ is reachable from $u$ by a path of length less than $c\sqrt{B}$, for
some constants $c, c_0$ depending upon the well-shapedness of the mesh. Here is
the traversal strategy:

We start from any vertex and always move rightward within the mesh. The
path can be visualized as a sequence of sub-paths of type $p'$ and $p''$. At any
point, let $v$ be the most recently visited vertex. If there is any uncovered
node inside $Cell_v$, we extend our path to that uncovered node (we call this a
sub-path of type $p'$) and thus force a block-read from disk; otherwise we
extend our path to a covered node lying closest to the right edge of $Cell_v$
(we call this a sub-path of type $p''$).

Let $b$ be the number of block transfers encountered in traversing $t$ steps
in the mesh according to the above strategy. Every sub-path of type $p'$ causes
a block transfer, so the number of sub-paths of type $p'$ is at most $b$. Also
note that for every sub-path of type $p''$, the number of covered nodes lying to
the left of the path-front in the mesh increases by $c_0 B$. Thus we can
associate a set of $c_0 B$ unique covered nodes with each sub-path of type $p''$
(uniqueness follows from the unidirectionality of motion). Since a block
transfer can cover at most $B$ nodes, it follows that the number of sub-paths of
type $p''$ is at most $b/c_0$. So the total number of sub-paths (of type $p'$
and $p''$) is bounded by $b(1 + 1/c_0)$. Also note that the length of each
sub-path is no more than $c\sqrt{B}$ (from the definition of $Cell$ above).
Hence $t \le c\sqrt{B}\, b\,(1 + 1/c_0)$, or in other words, the number of block
transfers $b$ required to traverse $t$ steps is $\Omega(t/\sqrt{B})$. Hence we
can conclude that the traversal strategy described above ensures a block
transfer after every $O(\sqrt{B})$ steps on average, irrespective of the
underlying blocking scheme of the mesh.

Fig. 2. Two types of sub-paths from a node v in the mesh (the nodes lying in
the shaded region are covered nodes): (i) sub-path of type p' if there is any
uncovered node in $Cell_v$; (ii) sub-path of type p'' if all the nodes of
$Cell_v$ are covered
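The counting argument can be turned into a concrete lower bound on the number of block transfers. The sketch below assumes the inequality t <= c * sqrt(B) * b * (1 + 1/c0), which is how the argument above is read here; the constants c and c0 depend on the mesh and the sample values are purely illustrative.

```python
import math


def min_block_transfers(t, B, c, c0):
    """Smallest integer b compatible with t <= c * sqrt(B) * b * (1 + 1/c0)."""
    return math.ceil(t / (c * math.sqrt(B) * (1 + 1 / c0)))


# Hypothetical constants: c = 1 and c0 = 1, block size B = 100, walk of 1000 steps.
print(min_block_transfers(1000, 100, 1.0, 1.0))  # prints: 50
```

In this example the adversarial walk forces at least one block transfer per 20 steps, that is, per 2 * sqrt(B) steps, whatever blocking scheme is used.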

3.3 Efficient Blocking of Planar Meshes 

We can block planar meshes efficiently using our blocking scheme for planar
graphs described in Section 2. From Lemma 4 it follows that
$r^-(k) = \Omega(\sqrt{k})$ for $k \le B$, where the constant of proportionality
depends upon $(\alpha, \gamma)$. Also note that the face size in a planar mesh
is 3. So $s = \Omega(B)$, again with a constant depending upon
$(\alpha, \gamma)$. It follows easily from Theorem 2 that our blocking scheme
guarantees a speed-up of $\Omega(\sqrt{B})$ in a planar mesh with parameters
$\alpha, \gamma$. In the previous subsection we established an upper bound of
$O(\sqrt{B})$ on the speed-up in a planar mesh. Thus our blocking scheme
achieves optimal speed-up in a planar mesh having a good degree of
well-shapedness.

Theorem 9. There is a space optimal blocking scheme which ensures a speed-up
of $\Omega(\sqrt{B})$ in a planar mesh, where the constant of proportionality
depends upon the parameters $(\alpha, \gamma)$ that measure the
well-shapedness of the mesh.

4 Conclusions 

We addressed the problem of planar graph blocking in this paper. We described
a blocking scheme which guarantees improved speed-up in planar graphs of
bounded face size over previous blocking schemes. We also established a bound
on the best worst-case speed-up that can be achieved in a planar mesh. For
planar meshes with a good degree of well-shapedness (local uniformity) our
blocking scheme achieves optimal speed-up.

There is still no space optimal blocking scheme which can ensure I-O efficient
traversal in general graphs (not necessarily planar). Such a scheme will help solve
a large number of problems which require I-O efficient traversal in general graphs.

Abstract. The paper presents an extension μHDC of Higher-order Duration
Calculus (HDC, [ZGZ99]) by a polyadic least fixed point (μ) operator
and a class of non-logical symbols with a finite variability restriction
on their interpretations, which classifies these symbols as intermediate
between rigid symbols and flexible symbols as known in DC. The
operator and the new kind of symbols enable straightforward specification
of recursion and data manipulation by HDC. The paper contains a
completeness theorem about an extension of the proof system for HDC by
axioms about μ and symbols of finite variability for a class of simple
μHDC formulas. The completeness theorem is proved by the method of
local elimination of the extending operator μ, which was earlier used for
a similar purpose in [Gue98].



Introduction 

Duration calculus (DC, [ZHR91]) has been proved to be a suitable formal
system for the specification of the semantics of concurrent real-time
programming languages [SX98, ZH00]. The introduction of a least fixed point operator to DC
was motivated by the need to specify recursive programming constructs simply 
and straightforwardly. Recursive control structures as available in procedural 
programming languages are typically approximated through translation into it- 
erative ones with explicit special storage (stacks). This blurs intuition and can 
add a significant overhead to the complexity of deductive verification. It is also 
an abandonment of the principle of abbreviating away routine elements of proof 
in specialised notations. That is why it is worth having an immediate way not 
only to specify but also to be able to reason about this style of recursion as it 
appears in high level programming languages. 

Recently, an extension of DC by quantifiers which bind state variables 
(boolean valued functions of time) was introduced [ZGZ99]. Systematic studies 
regarding the application of this sort of quantification in DC had gained 
speed earlier, cf. [Pan95]. The extension allowed the integration of some 
advanced features of DC, such as super-dense chop [ZH96, HX99], into a single 
general system, called Higher-order Duration Calculus (HDC), and enabled the 
specification of the semantics of temporal specification and programming 
languages such as Verilog and Timed RAISE [ZH00, LH99] by DC. The kind of 
completeness of the proof system of HDC addressed in [ZGZ99], which is 
ω-completeness, made it possible to conclude the study of the expressive power 
of some axioms about the state quantifier. 

In this paper we present some axioms about the least fixed point operator in 
HDC and show that adding them to a proof system for HDC yields a complete 
proof system for a fragment of the extension of HDC with this operator, μHDC. 

The axioms we study are obtained by paraphrasing the inference rules known 
about the propositional modal μ-calculus (cf. [Koz83, Wal93]), which were 
first introduced to DC in [PR95]. The novelty in our approach is the way we 
use the expressive power of the axioms about the μ-operator in our 
completeness argument, because, unlike the propositional μ-calculus, μHDC is a 
first-order logic with a binary modal operator. 

Our method was first developed and applied in [Gue98] to so-called simple 
DC* formulas which were introduced in [DW94] as a DC counterpart of a class 
of finite timed automata. That class was later significantly extended in 
[DG99, Gue00]. In this paper we show the completeness of an extension of a 
proof system for HDC for a corresponding class of simple μHDC formulas. 

Our method of proof significantly relies on the exact form of the 
completeness of the proof system for HDC, which underlies the extension in 
focus. The completeness theorem about the original proof system for DC [HZ92] 
applies to the derivability of individual formulas only, and we need to have 
equivalence between the satisfiability of the infinite sets of instances of 
our new axioms and the consistency of these sets together with some other 
formulas, i.e. we need an ω-complete proof system for HDC. That is why we use 
a modification of the system from [ZGZ99], which is ω-complete with respect to 
a semantics for HDC, shaped after the abstract semantics of ITL, as presented 
in [Dut95]. Material to suggest an ω-completeness proof for this modification 
can be found starting from the completion of Peano arithmetic by an ω-rule 
(cf. e.g. [Men64]) to [ZNJ99]. The completeness result presumed in this paper 
applies to the class of abstract HDC frames with their duration domains 
satisfying the principle of Archimedes. Informally, this principle states 
that there are no infinitely small positive durations, and it holds for the 
real-time based frame. 

The purpose of the modification of HDC here is to make a form of finite 
variability which is preserved under logical operations explicitly appear in 
this system. The choice to work with Archimedean duration domains is made only 
for the convenience of axiomatising this kind of finite variability (axiom 
HDC5 below). 

The fragment of the μHDC language that our completeness result applies to is 
sufficient to provide convenience of the targeted kind for the design and use 
of HDC semantics of practically significant timed languages which admit 
recursive procedure invocations. 
1 Preliminaries on HDC with Abstract Semantics 

In this section we briefly introduce a version of HDC with abstract 
semantics [ZGZ99], which closely follows the abstract semantics for ITL given 
in [Dut95]. It slightly differs from the one presented in [ZGZ99]. Along with 
quantification over state, we allow quantifiers to bind so-called temporal vari- 
ables and temporal propositional letters with the finite variability property. 



1.1 Languages 

A language for HDC is built starting from some given sets of constant symbols 
a, b, c, ..., function symbols f, g, ..., relation symbols R, S, ..., 
individual variables x, y, ... and state variables P, Q, .... Function symbols 
and relation symbols have arity to indicate the number of arguments they take 
in terms and formulas. Relation symbols and function symbols of arity 0 are 
also called temporal propositional letters and temporal variables 
respectively. Constant symbols, function symbols and relation symbols can be 
either rigid or flexible. Flexible symbols can be either symbols of finite 
variability (fv symbols) or not. Rigid symbols, fv symbols and (general) 
flexible symbols are subjected to different restrictions on their 
interpretations. Every HDC language contains countable sets of individual 
variables, fv temporal propositional letters and fv temporal variables, the 
rigid constant symbol $0$, the flexible constant symbol $\ell$, the rigid 
binary function symbol $+$ and the rigid binary relation symbol $=$. Given the 
sets of symbols, state expressions $S$, terms $t$ and formulas $\varphi$ in a 
HDC language are defined by the BNFs: 

$S ::= 0 \mid P \mid S \Rightarrow S$ 
$t ::= c \mid \int S \mid f(t, \ldots, t) \mid \overleftarrow{t} \mid \overrightarrow{t}$ 
$\varphi ::= \bot \mid R(t, \ldots, t) \mid \varphi \Rightarrow \varphi \mid (\varphi; \varphi) \mid \exists x\varphi \mid \exists v\varphi \mid \exists P\varphi$ 

In BNFs for formulas here and below $v$ stands for a fv temporal variable or a 
fv temporal propositional letter. 

Terms and formulas which contain no flexible symbols are called rigid. Terms 
and formulas which contain only fv flexible symbols, rigid symbols and 
subformulas of the kind $\int S = \ell$ are called fv terms and fv formulas 
respectively. Terms of the kinds $\overleftarrow{t}$ and $\overrightarrow{t}$ 
are well-formed only if $t$ is a fv term. We call individual variables, 
temporal variables, temporal propositional letters and state variables just 
variables, in case the exact kind of the symbol is not significant. 



1.2 Frames, Models, and Satisfaction 

Definition 1. A time domain is a linearly ordered set with no end points. 
Given a time domain $(T, \leq)$, we denote the set 
$\{[\tau_1, \tau_2] : \tau_1, \tau_2 \in T, \tau_1 \leq \tau_2\}$ of intervals 
in $T$ by $\mathbf{I}(T)$. Given $\sigma_1, \sigma_2 \in \mathbf{I}(T)$, where 
$(T, \leq)$ is a time domain, we denote $\sigma_1 \cup \sigma_2$ by 
$\sigma_1; \sigma_2$, in case $\max\sigma_1 = \min\sigma_2$. A duration domain 
is a system of the type $(D, 0, +, \leq)$ which satisfies the following axioms: 

(D1) $x + (y + z) = (x + y) + z$ 
(D2) $x + 0 = x$ 
(D3) $x + y = x + z \Rightarrow y = z$ 
(D4) $\exists z(x + z = y)$ 
(D5) $x + y = y + x$ 
(D6) $x \leq x$ 
(D7) $x \leq y \wedge y \leq x \Rightarrow x = y$ 
(D8) $x \leq y \wedge y \leq z \Rightarrow x \leq z$ 
(D9) $x \leq y \Leftrightarrow \exists z(x + z = y \wedge 0 \leq z)$ 
(D10) $x \leq y \vee y \leq x$ 

Given a time domain $(T, \leq)$ and a duration domain $(D, 0, +, \leq)$, 
$m : \mathbf{I}(T) \to D$ is a measure if 

(M0) $x \geq 0 \Rightarrow \exists\sigma(m(\sigma) = x)$ 
(M1) $\min\sigma = \min\sigma' \wedge m(\sigma) = m(\sigma') \Rightarrow \max\sigma = \max\sigma'$ 
(M2) $\max\sigma = \min\sigma' \Rightarrow m(\sigma) + m(\sigma') = m(\sigma \cup \sigma')$ 
(M3) $0 \leq x \wedge 0 \leq y \wedge m(\sigma) = x + y \Rightarrow \exists\tau \in \sigma\ m([\min\sigma, \tau]) = x$. 



Definition 2. A HDC frame is a tuple of the kind 
$((T, \leq), (D, 0, +, \leq), m)$, where $(T, \leq)$ is a time domain, 
$(D, 0, +, \leq)$ is a duration domain, and $m : \mathbf{I}(T) \to D$ is a 
measure. 
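
As an illustration of Definitions 1 and 2, the following minimal sketch models the real-time based frame over rational end points, with the measure taken to be interval length. The function names `measure` and `chop` and the sample intervals are assumptions introduced here for illustration; they do not come from the paper. The sketch only spot-checks (M2) and (M3) on one example and proves nothing in general.

```python
from fractions import Fraction

def measure(interval):
    """Length of a closed interval (lo, hi); plays the role of the measure m."""
    lo, hi = interval
    if lo > hi:
        raise ValueError("interval end points must satisfy lo <= hi")
    return hi - lo

def chop(s1, s2):
    """Return s1;s2, which is defined only when max(s1) == min(s2)."""
    if s1[1] != s2[0]:
        raise ValueError("intervals do not meet")
    return (s1[0], s2[1])

a = (Fraction(0), Fraction(3, 2))
b = (Fraction(3, 2), Fraction(4))

# (M2): measures add up along a chop point.
m2_holds = measure(a) + measure(b) == measure(chop(a, b))

# (M3): if m(s) = x + y with x, y >= 0, some tau in s splits off a prefix of measure x.
s, x = chop(a, b), Fraction(1)
tau = s[0] + x
m3_holds = s[0] <= tau <= s[1] and measure((s[0], tau)) == x
```

Exact rational arithmetic is used so that the equalities in (M2) and (M3) are not disturbed by floating-point rounding.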



Definition 3. Given a HDC frame $F = ((T, \leq), (D, 0, +, \leq), m)$ and a 
HDC language L, a function $I$ which is defined on the set of the non-logical 
symbols of L is called an interpretation of L into F, if 

o $I(c), I(x) \in D$ for constant symbols $c$ and individual variables $x$ 
o $I(f) : D^n \to D$ for rigid n-place function symbols $f$ 
o $I(f) : \mathbf{I}(T) \times D^n \to D$ for flexible n-place function symbols $f$ 
o $I(R) : D^n \to \{0, 1\}$ for rigid n-place relation symbols $R$ 
o $I(R) : \mathbf{I}(T) \times D^n \to \{0, 1\}$ for flexible n-place relation symbols $R$ 
o $I(P) : T \to \{0, 1\}$ for state variables $P$ 
o $I(0) = 0$, $I(\ell) = m$, $I(+) = +$ and $I(=)$ is $=$. 

The following finite variability condition is imposed on interpretations of 
state variables $P$: 

Every $\sigma \in \mathbf{I}(T)$ can be represented in the form 
$\sigma_1; \ldots; \sigma_m$ so that $I(P)$ is constant on 
$[\min\sigma_i, \max\sigma_i)$, $i = 1, \ldots, m$. 

A similar condition is imposed on the interpretations of fv symbols $s$. Given 
a frame F and an interpretation $I$ as above, a function (predicate) $A$ on 
$\mathbf{I}(T) \times D^n$ is called fv in F, $I$ with respect to 
$\sigma_1, \ldots, \sigma_m \in \mathbf{I}(T)$ iff 
$\sigma = \sigma_1; \ldots; \sigma_m$ for some interval $\sigma$ and for all 
$d_1, \ldots, d_n \in D$, $i, j \leq m$, $i \leq j$, 
$\sigma' \in \mathbf{I}(T)$: 

o if $\min\sigma' \in (\min\sigma_i, \max\sigma_i)$ and $\max\sigma' \in (\min\sigma_j, \max\sigma_j)$, 
  $A(\sigma', d_1, \ldots, d_n)$ is determined by $d_1, \ldots, d_n$, $i$ and $j$ only; 
o if $\min\sigma' = \min\sigma_i$ and $\max\sigma' \in (\min\sigma_j, \max\sigma_j)$, 
  $A(\sigma', d_1, \ldots, d_n)$ is determined by $d_1, \ldots, d_n$, $i$ and $j$ only, possibly in a different way; 
o if $\min\sigma' \in (\min\sigma_i, \max\sigma_i)$ and $\max\sigma' = \min\sigma_j$, 
  $A(\sigma', d_1, \ldots, d_n)$ is determined by $d_1, \ldots, d_n$, $i$ and $j$ only, possibly in a different way; 
o if $\min\sigma' = \min\sigma_i$ and $\max\sigma' = \min\sigma_j$, 
  $A(\sigma', d_1, \ldots, d_n)$ is determined by $d_1, \ldots, d_n$, $i$ and $j$ only, possibly in a different way. 
A symbol $s$ is fv with respect to $\sigma_1, \ldots, \sigma_m$ in some F, $I$ 
as above, if $I(s)$ has the corresponding property. Given a fv symbol $s$, for 
every $\sigma \in \mathbf{I}(T)$ there should be 
$\sigma_1, \ldots, \sigma_m \in \mathbf{I}(T)$ such that 
$\sigma = \sigma_1; \ldots; \sigma_m$ and $s$ is fv with respect to 
$\sigma_1, \ldots, \sigma_m$ in F, $I$. 
Given a language L, a pair $(F, I)$ is a model for L, if F is a frame and $I$ 
is an interpretation of L into F. 

Interpretations $I$ and $J$ of language L into frame F are said to s-agree, if 
they assign the same values to all non-logical symbols from L, but possibly 
$s$. 

Given a frame $F$ (model $M$) we denote its components by $(T_F, \leq_F)$, 
$(D_F, 0_F, +_F, \leq_F)$ and $m_F$ ($(T_M, \leq_M)$, 
$(D_M, 0_M, +_M, \leq_M)$ and $m_M$) respectively. We denote the frame and the 
interpretation of a given model $M$ by $F_M$ and $I_M$ respectively. 



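Definition 4. Given a model $M = (F, I)$ for the language L, 
$\tau \in T_M$ and $\sigma \in \mathbf{I}(T_M)$, the values $I_\tau(S)$ and 
$I_\sigma(t)$ of state expressions $S$ and terms $t$ and the satisfaction of 
formulas $\varphi$ are defined by induction on their construction as follows: 

$I_\tau(0) = 0$ 
$I_\tau(P) = I(P)(\tau)$ 
$I_\tau(S_1 \Rightarrow S_2) = \max\{1 - I_\tau(S_1), I_\tau(S_2)\}$ 
$I_\sigma(c) = I(c)$ for rigid $c$ 
$I_\sigma(c) = I(c)(\sigma)$ for flexible $c$ 
$I_\sigma(\int S) = \int_{\min\sigma}^{\max\sigma} I_\tau(S)\,d\tau$ 
$I_\sigma(f(t_1, \ldots, t_n)) = I(f)(I_\sigma(t_1), \ldots, I_\sigma(t_n))$ for rigid $f$ 
$I_\sigma(f(t_1, \ldots, t_n)) = I(f)(\sigma, I_\sigma(t_1), \ldots, I_\sigma(t_n))$ for flexible $f$ 
$I_\sigma(\overleftarrow{t}) = d$ if $I_{\sigma'}(t) = d$ for some $\tau < \min\sigma$ and all $\sigma' \subseteq (\tau, \min\sigma)$ 
$I_\sigma(\overrightarrow{t}) = d$ if $I_{\sigma'}(t) = d$ for some $\tau > \max\sigma$ and all $\sigma' \subseteq (\max\sigma, \tau)$ 

$M, \sigma \not\models \bot$ 
$M, \sigma \models R(t_1, \ldots, t_n)$ iff $I(R)(I_\sigma(t_1), \ldots, I_\sigma(t_n)) = 1$ for rigid $R$ 
$M, \sigma \models R(t_1, \ldots, t_n)$ iff $I(R)(\sigma, I_\sigma(t_1), \ldots, I_\sigma(t_n)) = 1$ for flexible $R$ 
$M, \sigma \models (\varphi \Rightarrow \psi)$ iff either $M, \sigma \models \psi$ or $M, \sigma \not\models \varphi$ 
$M, \sigma \models (\varphi; \psi)$ iff there exist $\sigma_1, \sigma_2 \in \mathbf{I}(T_F)$ such that $\sigma = \sigma_1; \sigma_2$, $M, \sigma_1 \models \varphi$ and $M, \sigma_2 \models \psi$ 
$M, \sigma \models \exists x\varphi$ iff $(F, J), \sigma \models \varphi$ for some $J$ which x-agrees with $I$ 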

Note that discrete time domains, which make the above definitions of 
$\overleftarrow{t}$ and $\overrightarrow{t}$ incorrect, also render any 
"corrected" definition for these operators grossly non-introspective, and 
therefore these operators should be disregarded in the case of discrete 
domains. In the clause about $\exists x$ above, $x$ stands for a variable of 
an arbitrary kind, temporal variables and propositional temporal letters 
included. The integral used to define values of terms of the kind $\int S$ 
above is defined as follows. Given $\sigma$ and $S$, there exist 
$\sigma_1, \ldots, \sigma_n \in \mathbf{I}(T_F)$ such that 
$\sigma = \sigma_1; \ldots; \sigma_n$ and $I_\tau(S)$ is constant in 
$[\min\sigma_i, \max\sigma_i)$, $i = 1, \ldots, n$. Given such a partition 
$\sigma_1, \ldots, \sigma_n$ of $\sigma$, we put: 

$$\int_{\min\sigma}^{\max\sigma} I_\tau(S)\,d\tau = \sum_{i = 1, \ldots, n,\ I_{\min\sigma_i}(S) = 1} m(\sigma_i)$$ 

Clearly, the value thus defined does not depend on the choice of 
$\sigma_1, \ldots, \sigma_n$. 
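
The definition of the integral can be sketched in a few lines of code. The sketch below assumes numeric time points and interval length as the measure; the function `integrate_state` and the sample state `P` are hypothetical names chosen for illustration. It sums the lengths of those pieces of the partition on which the state has value 1, using the value at the left end point of each piece.

```python
def integrate_state(partition, state):
    """Sum the lengths of the pieces on which `state` equals 1.

    partition: list of (lo, hi) pairs, each piece ending where the next begins;
    state: function from time points to {0, 1}, assumed constant on each [lo, hi).
    """
    total = 0
    for (lo, hi), nxt in zip(partition, partition[1:] + [None]):
        if nxt is not None and hi != nxt[0]:
            raise ValueError("pieces must be adjacent")
        if state(lo) == 1:  # the value at the minimum decides the whole piece
            total += hi - lo
    return total

# Hypothetical state P: 1 on [0, 2), 0 on [2, 3), 1 on [3, 5).
def P(t):
    return 1 if (0 <= t < 2 or 3 <= t < 5) else 0

coarse = [(0, 2), (2, 3), (3, 5)]
fine = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
```

Calling `integrate_state` on `coarse` and on `fine` yields the same total, which illustrates on one example the remark that the value does not depend on the chosen partition.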

1.3 Abbreviations 

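Infix notation and the propositional constant $\top$, the connectives 
$\wedge$, $\vee$ and $\Leftrightarrow$, and the quantifier $\forall$ are 
introduced as abbreviations in the usual way. $1$ stands for $0 \Rightarrow 0$ 
in state expressions. The relation symbol $\leq$ is defined by the axiom 
$x \leq y \Leftrightarrow \exists z(x + z = y)$. The related symbols $\geq$, 
$<$ and $>$ are introduced in the usual way. We use the following DC-specific 
abbreviations: 

$\lceil S \rceil \rightleftharpoons \int S = \ell \wedge \ell \neq 0$, 
$\Diamond\varphi \rightleftharpoons ((\top; \varphi); \top)$, 
$\Box\varphi \rightleftharpoons \neg\Diamond\neg\varphi$, 
$n.t \rightleftharpoons \underbrace{t + \ldots + t}_{n \text{ times}}$. 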

OiP ^ yf 0; yf 0), ^ ((^ = A.^ = t2;T). 



1.4 Proof System 



Results in the rest of this paper hold for the class of DC models which 
satisfy the principle of Archimedes. It states that given positive durations 
$d_1$ and $d_2$, there exists a natural number $n$ such that 
$n.d_1 \geq d_2$. 

Here follows a proof system for HDC which is ω-complete with respect to the 
class of HDC models which satisfy the principle of Archimedes: 



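($A1_l$) $(\varphi; \psi) \wedge \neg(\chi; \psi) \Rightarrow (\varphi \wedge \neg\chi; \psi)$ 
($A1_r$) $(\varphi; \psi) \wedge \neg(\varphi; \chi) \Rightarrow (\varphi; \psi \wedge \neg\chi)$ 
($A2$) $((\varphi; \psi); \chi) \Leftrightarrow (\varphi; (\psi; \chi))$ 
($R_l$) $(\varphi; \psi) \Rightarrow \varphi$ if $\varphi$ is rigid 
($R_r$) $(\varphi; \psi) \Rightarrow \psi$ if $\psi$ is rigid 
($B_l$) $(\exists x\varphi; \psi) \Rightarrow \exists x(\varphi; \psi)$ if $x \notin FV(\psi)$ 
($B_r$) $(\varphi; \exists x\psi) \Rightarrow \exists x(\varphi; \psi)$ if $x \notin FV(\varphi)$ 
($L1_l$) $(\ell = x; \varphi) \Rightarrow \neg(\ell = x; \neg\varphi)$ 
($L1_r$) $(\varphi; \ell = x) \Rightarrow \neg(\neg\varphi; \ell = x)$ 
($L2$) $\ell = x + y \Leftrightarrow (\ell = x; \ell = y)$ 
($L3_l$) $\varphi \Rightarrow (\ell = 0; \varphi)$ 
($L3_r$) $\varphi \Rightarrow (\varphi; \ell = 0)$ 

($MP$) $\dfrac{\varphi \quad \varphi \Rightarrow \psi}{\psi}$ 
($G$) $\dfrac{\varphi}{\forall x\varphi}$ 
($N_l$) $\dfrac{\varphi}{\neg(\neg\varphi; \psi)}$ 
($N_r$) $\dfrac{\varphi}{\neg(\psi; \neg\varphi)}$ 
($Mono_l$) $\dfrac{\varphi \Rightarrow \psi}{(\varphi; \chi) \Rightarrow (\psi; \chi)}$ 
($Mono_r$) $\dfrac{\varphi \Rightarrow \psi}{(\chi; \varphi) \Rightarrow (\chi; \psi)}$ 
($\omega$) $\dfrac{\forall k < \omega\ [(\ell = 0 \vee \lceil S \rceil \vee \lceil \neg S \rceil)^k / R]\varphi}{[\top / R]\varphi}$ 
($Arch$) $\dfrac{\forall n < \omega\ \varphi \Rightarrow n.x \leq y}{\varphi \Rightarrow x \leq 0}$ 
($NL$) $\dfrac{((\ell = a; \varphi); \ell = b) \Rightarrow ((\ell = a; \psi); \ell = b)}{\varphi \Rightarrow \psi}$ 

($DC0$) $\ell = 0 \Rightarrow \int S = 0$ 
($DC1$) $\int 0 = 0$ 
($DC2$) $\lceil 1 \rceil \vee \ell = 0$ 
($DC3$) $(\int S = x; \lceil S \rceil \wedge \ell = y) \Rightarrow \int S = x + y$ 
($DC4$) $(\int S = x; \lceil \neg S \rceil) \Rightarrow \int S = x$ 
($DC5$) $\lceil S_1 \rceil \wedge \lceil S_2 \rceil \Leftrightarrow \lceil S_1 \wedge S_2 \rceil$ 
($DC6$) $\lceil S_1 \rceil \Leftrightarrow \lceil S_2 \rceil$, if $\vdash_{PC} S_1 \Leftrightarrow S_2$ 
($DC7$) $\lceil S \rceil \Rightarrow \Box(\lceil S \rceil \vee \ell = 0)$ 

($PV1$) $(\ell \neq 0; \overleftarrow{t} = x \wedge \ell = y) \Leftrightarrow (\top; (\Box_i(t = x) \wedge \ell \neq 0; \ell = y))$ 
($PV2$) $(\overrightarrow{t} = x \wedge \ell = y; \ell \neq 0) \Leftrightarrow ((\ell = y; \Box_i(t = x) \wedge \ell \neq 0); \top)$ 

($\exists_v$) $[t/v]\varphi \Rightarrow \exists v\varphi$ for fv-terms $t$ and temporal variables $v$; 
($\exists_p$) $[\psi/p]\varphi \Rightarrow \exists p\varphi$ for fv-formulas $\psi$ and temporal propositional letters $p$; 

($HDC1$) $\exists v(\overleftarrow{v} = x)$ 
($HDC2$) $\exists v(\overrightarrow{v} = x)$ 
($HDCS$) $(\exists S\varphi; \exists S\psi) \Leftrightarrow \exists S(\varphi; \psi)$ 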
(HDCSv^i) X < £ ^ 3i>Vj/iVj/ 2('7’' = tl A A 

A{yi<xAy 2 <xAyi<y 2 ^ Cai,a2 (« = ^i))^ 

A(j/i > xAy2> xAyi <y2Ay2 < ^ Cs/i,^ (« = ^2))A 
A{yi <x Ay2> X Ay2 < ^ Cs/i,^ (« = ^s))) 

(HDC3v,r) X < £ ^ 3iiVj/iVj/2('7’' = ti A if = T2A 

A{yi < X Ay2 < X Ayi <y2 ^ (« = ^i))A 

A{yi >xAy2>xAyi<y2Ay2 < ^ Cs/i,^ (« = ^2))A 

A{yi <xAy2>xAy2 < £ => ^yi,v2{x = ts))) 

(HDC 3 p,i) x<£^ 3 pVyiVy 2 ( 

(yi <xAy2 <xAyi <y2^ Cai.s/2(P V’i)A 

A{yi > X Ay2 > X Ayi <y2 Ay2 < £ ^ ^yi.y^iP i>2))A 

A{yi <xAy2>xAy2 < ^ Cs/i,^ (P V’a))) 

{HDCSp,r) x<£^ 3 pVyiVy 2 ( 

(yi < X Ay2 < X Ayi <y2 ^ ^ai.s/2(P ^ V’i))A 

A(j/i > a;Ay 2 > a;Ayi < j /2 Ay 2 < ^ Cs/i,K2 (P ^ 2 ))A 

A{yi <xAy2>xAy2 < Cs/i,a2 (P V’s))) 

{HDC 4 ) VxVj/(((p A £ = x-jt/j) A -'(ip A £ = ^ X < y) ^ 

=> 3x(Vj/(((p A £ = y\ij)) y < x) 1 3a;(Vy((</9 A £ = y, ip) 4 ^ y < x) 

(HDC 5 ) £^ 0 ^ 3 x{x ^ OA 

Vyn(</9 A 0(-i(p A 0(</9 A 0(-i</9 A <>{ip A£ = y)))) £ < x + y)) 

The symbol $x$ denotes a variable of an arbitrary kind in the rule $G$ and the 
axioms $B_l$ and $B_r$. Instances of $HDC3_*$, $HDC4$ and $HDC5$ are valid 
only if $v, p, x, y, y_1, y_2 \notin FV(t_1), FV(t_2), FV(t_3), FV(\psi_1), 
FV(\psi_2), FV(\psi_3), FV(\varphi), FV(\psi)$ and $t_1, t_2, t_3$, 
$\psi_1, \psi_2, \psi_3$, $\varphi$ and $\psi$ are fv terms and formulas 
respectively. 

The proof system also includes the axioms D1-D10 for duration domains, first 
order axioms and equality axioms. Substitution $[t/x]\varphi$ of variable $x$ 
by term $t$ in formula $\varphi$ is allowed in proofs only if either $t$ is 
rigid, or $x$ is not in the scope of a modal operator. 

Note that this proof system is slightly different from the original HDC one, 
as fv symbols are not considered in HDC as in [ZGZ99]. Nevertheless, its 
ω-completeness can be shown in a way that is similar to the one taken in 
[ZNJ99]. 

The meaning of the new axioms $HDC1$, $HDC2$ and $HDC3_*$ is to enable the 
construction of fv functions and predicates on the set of intervals of the 
given model (from simpler ones). Given that a language L has rigid constants 
to name all the durations in a model $M$ for it, as in the case of canonical 
models which are used in the completeness argument for this system, the 
existence of every fv function and predicate on $\mathbf{I}(T_M)$ can be shown 
using these axioms. The axioms $HDC4$ and $HDC5$ express the restrictions on 
the interpretations of fv formulas, and hence of the fv symbols occurring in 
them. The following ω-completeness theorem holds about this proof system: 

Theorem 1. Let $\Gamma$ be a consistent set of formulas from the language L of 
HDC. Then there exists a model $M$ for L and an interval 
$\sigma \in \mathbf{I}(T_M)$ such that $M, \sigma \models \varphi$ for all 
$\varphi \in \Gamma$. 
2 μHDC 

In this section we briefly introduce the extension of HDC by a least fixed 
point operator. 

2.1 Languages of μHDC 

A language of μHDC is built using the same sets of symbols as for HDC 
languages and a distinguished countable set of propositional variables 
X, Y, .... Terms are defined as in HDC. The BNF for formulas is extended to 
allow fixed point operator formulas as follows: 

$\varphi ::= \bot \mid X \mid R(t, \ldots, t) \mid \varphi \Rightarrow \varphi \mid (\varphi; \varphi) \mid \mu_i X \ldots X.\varphi, \ldots, \varphi \mid \exists x\varphi \mid \exists v\varphi \mid \exists P\varphi$ 

Formulas of the kind $\mu_i X_1 \ldots X_m.\varphi_1, \ldots, \varphi_n$ are 
well-formed only if $m = n$, all the occurrences of the variables 
$X_1, \ldots, X_n$ in $\varphi_1, \ldots, \varphi_n$ are positive, i.e. each 
of these occurrences is in the scope of an even number of negations, 
$X_1, \ldots, X_n$ are distinct variables and $i \in \{1, \ldots, n\}$. 
Formulas which contain μ are not regarded as fv. Note that we work with a 
vector form of the least fixed point operator. This has some technical 
advantages, because it enables elimination of nested occurrences of μ under 
some additional conditions. 

2.2 Frames, Models, and Satisfaction 

Frames and models for μHDC languages are as for HDC languages. The only 
relative novelty is the extension of the satisfaction relation $\models$, 
which captures μ-formulas too. 

Let $M = (F, I)$ be a model for the (μHDC) language L. Let 
$\tilde{I}(\varphi)$ denote the set 
$\{\sigma \in \mathbf{I}(T_F) : M, \sigma \models \varphi\}$ for an arbitrary 
formula $\varphi$ from L. Let $s$ be a non-logical symbol in L and $a$ be a 
constant, function or predicate of the type of $s$. We denote the 
interpretation of L into F which s-agrees with $I$ and assigns $a$ to $s$ by 
$I_s^a$. Given a set $A \subseteq \mathbf{I}(T_M)$, we define the function 
$\chi_A : \mathbf{I}(T_M) \to \{0, 1\}$ by putting $\chi_A(\sigma) = 1$ iff 
$\sigma \in A$. 

Now assume that the propositional variables $X_1, \ldots, X_n$ occur in 
$\varphi$. We define the function $f_\varphi$ by the equality 
$f_\varphi(A_1, \ldots, A_n) = \tilde{I}_{X_1, \ldots, X_n}^{\chi_{A_1}, \ldots, \chi_{A_n}}(\varphi)$. 
Assume that the variables $X_1, \ldots, X_n$ have only positive occurrences in 
$\varphi$. Then $f_\varphi$ is monotone on each of its arguments, i.e. 
$A_i \subseteq A_i'$ implies 
$f_\varphi(A_1, \ldots, A_i, \ldots, A_n) \subseteq f_\varphi(A_1, \ldots, A_i', \ldots, A_n)$. 

Now consider a sequence of $n$ formulas, $\varphi_1, \ldots, \varphi_n$, which 
have only positive occurrences of the variables $X_1, \ldots, X_n$ in them. 
Then the system of inclusions 

$f_{\varphi_i}(A_1, \ldots, A_n) \subseteq A_i$, $i = 1, \ldots, n$ 

has a least solution, which is also a least fixed point of the operator 

$\lambda A_1 \ldots A_n.(f_{\varphi_1}(A_1, \ldots, A_n), \ldots, f_{\varphi_n}(A_1, \ldots, A_n))$. 

Let this solution be $(B_1, \ldots, B_n)$, $B_i \subseteq \mathbf{I}(T_F)$. We 
define the satisfaction relation for 
$\mu_i X_1 \ldots X_n.\varphi_1, \ldots, \varphi_n$ by putting: 

$M, \sigma \models \mu_i X_1 \ldots X_n.\varphi_1, \ldots, \varphi_n$ iff $\sigma \in B_i$. 
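
The least solution of such a system of inclusions can be illustrated on a finite universe, where iterating a monotone vector operator from empty sets reaches the least fixed point after finitely many steps. The sketch below makes that simplifying assumption: the set of intervals of a model is in general infinite, and the paper relies on the existence of the least solution rather than on iteration. The names `least_fixed_point`, `UNIVERSE`, `f1` and `f2` are hypothetical and chosen only for this example.

```python
def least_fixed_point(operators):
    """Iterate a monotone vector operator on finite sets from the bottom element.

    operators: list of n functions; operators[i] maps a tuple of n frozensets
    to a set. Iteration starts from empty sets and stops when nothing changes.
    """
    current = tuple(frozenset() for _ in operators)
    while True:
        nxt = tuple(frozenset(f(current)) for f in operators)
        if nxt == current:
            return current
        current = nxt

# Hypothetical toy universe standing in for the set of intervals.
UNIVERSE = range(10)

# A1 contains 0 and every element of A2 shifted by 2; A2 is A1 shifted by 1.
# Both operators are monotone in each argument.
def f1(args):
    _, a2 = args
    return {0} | {n + 2 for n in a2 if n + 2 in UNIVERSE}

def f2(args):
    a1, _ = args
    return {n + 1 for n in a1 if n + 1 in UNIVERSE}

b1, b2 = least_fixed_point([f1, f2])
```

Here `b1` and `b2` play the roles of $B_1$ and $B_2$: the i-th component of the result is what the i-th projection of the vector operator denotes.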
3 Simple μHDC Formulas 

The class of formulas which we call simple in this paper is a straightforward 
extension to the class of simple DC* formulas considered in [Gue98]. We extend 
that class by allowing μ instead of iteration, positive formulas built up of 
fv symbols and existential quantification over the variables which occur in 
these formulas. 



3.1 Super-Dense Chop 



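The super-dense chop operator $(. \circ .)$ was introduced in [ZH96] to enable 
the expression of sequential computation steps which consume negligible time, 
yet occur in some specified causal order, by DC. Given that 
$v_1, \ldots, v_n$ are all the free temporal variables of formulas $\varphi$ 
and $\psi$, $(\varphi \circ \psi)$ is equivalent to 

$$\exists v_1' \ldots \exists v_n' \exists v_1'' \ldots \exists v_n'' \exists x_1 \ldots \exists x_n \left( \begin{array}{l} [v_1'/v_1, \ldots, v_n'/v_n]\varphi \wedge \bigwedge_{i=1}^{n} (\overrightarrow{v_i'} = x_i \wedge \Box(v_i' = v_i)) ; \\ {[v_1''/v_1, \ldots, v_n''/v_n]}\psi \wedge \bigwedge_{i=1}^{n} (\overleftarrow{v_i''} = x_i \wedge \Box(v_i'' = v_i)) \end{array} \right)$$ 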

3.2 Simple Formulas 

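Definition 1. Let L be a language for μHDC as above. We call μHDC formulas 
$\gamma$ which can be defined by the BNF 

$\gamma ::= \bot \mid R(t, \ldots, t) \mid X \mid (\gamma \wedge \gamma) \mid \gamma \vee \gamma \mid \neg\gamma \mid (\gamma; \gamma) \mid (\gamma \circ \gamma) \mid \mu_i X \ldots X.\gamma, \ldots, \gamma$ 

where $R$ and $t$ stand for either rigid or fv relation symbols and terms 
respectively, open fv formulas. We call an open fv formula strictly positive 
if it has no occurrences of propositional variables in the scope of $\neg$. An 
open fv formula is propositionally closed if it has no free occurrences of 
propositional variables. Simple μHDC formulas are defined by the BNF 

$\varphi ::= \ell = 0 \mid X \mid \lceil S \rceil \mid \lceil S \rceil \wedge \ell \prec a \mid \lceil S \rceil \wedge \ell \succ a \mid \lceil S \rceil \wedge \ell \succ a \wedge \ell \prec b \mid$ 
$\qquad \varphi \vee \varphi \mid (\varphi; \varphi) \mid (\varphi \circ \varphi) \mid \varphi \wedge \gamma \mid \mu_i X \ldots X.\varphi, \ldots, \varphi \mid \exists x\varphi \mid \exists v\varphi$ 

where $a$ and $b$ denote rigid constants, $\gamma$ denotes a propositionally 
closed strictly positive open fv formula, $x$ denotes a variable of arbitrary 
kind, $\prec \in \{\leq, <\}$ and $\succ \in \{\geq, >\}$. Additionally, a 
simple formula should not have subformulas of the kind $\exists x\varphi$ 
where $x$ has a free occurrence in the scope of a μ-operator in $\varphi$. 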

4 A Complete Proof System for the Simple Fragment of μHDC 

In this section we show the completeness of a proof system for the fragment 
of μHDC where the application of μ is limited to simple formulas. We add the 
following axioms and rule to the proof system for HDC with abstract semantics: 
(Mi) • 1 

[fllXi . . . JCjlipij ■ ■ ■ 5 ^nj ■ ■ ■ 7 f^n^l ■ ■ ■ • 7 ‘^n/ 

n 

(Ms) a ^(["0l/-^l7 ■ ■ • ^ '0z) ^ ■ ■ ■ ^n-'T^l7 ■ ■ ■ 7 ^ The 

z=l 

(/is) /iiXi . . .X^.(/ 5 i, . . . , [/iZi . . . Z„.V'l, . . . , V'n/i^]<Pfc, . . . ,<Pm 
. . . J*C^^ Zi . . . Z^ipi , . . . , (prn 5 7 • ■ ■ 5 V^n 

variable $Y$ should not have negative free occurrences in $\varphi_k$ in the 
instances of $\mu_3$. 



4.1 The Completeness Theorem 

Lemma 1. Let $\varphi$, $\alpha$ and $\beta$ be μHDC formulas and $Y$ be a 
propositional temporal letter. Let $Y$ not occur in $\varphi$ in the scope of 
quantifiers which bind any of the variables from 
$FV(\alpha) \cup FV(\beta)$. Then 
$\vdash_{\mu HDC} \Box(\alpha \Leftrightarrow \beta) \Rightarrow ([\alpha/Y]\varphi \Leftrightarrow [\beta/Y]\varphi)$. 

The following two propositions have a key role in our completeness argument. 
Detailed proofs are given in [Gue00b]. 

Proposition 1. Let $\gamma$ be a propositionally closed strictly positive open 
fv formula. Let $M$ be a model for the language L of $\gamma$ and 
$\sigma \in \mathbf{I}(T_M)$. Then there exists a μ-free propositionally 
closed strictly positive open fv formula $\gamma'$ such that 
$M, \sigma \models \Box(\gamma \Leftrightarrow \gamma')$. 

This proposition justifies regarding μ-formulas with fv subformulas as fv 
formulas. 

Proposition 2 (local elimination of μ from simple formulas). Let $\varphi$ be 
a propositionally closed simple μHDC formula. Let $M$ be a model for the 
language of $\varphi$ and $\sigma \in \mathbf{I}(T_M)$. Then there exists a 
μ-free formula $\psi$ such that 
$M, \sigma \models \Box(\varphi \Leftrightarrow \psi)$. 

Theorem 1 (completeness). Let $\Gamma$ be a set of formulas in a μHDC 
language L. Let every μ-subformula of a formula $\varphi \in \Gamma$ be 
simple, and moreover occur in $\varphi$ as a subformula of some 
propositionally closed μ-subformula of $\varphi$. Let $\Gamma$ be consistent 
with respect to $\vdash_{\mu HDC}$. Then there exists a model $M$ for L and an 
interval $\sigma \in \mathbf{I}(T_M)$ such that $M, \sigma \models \Gamma$. 

Proof. Proposition 1 entails that every fv μ-subformula of a formula from 
$\Gamma$ is locally equivalent to a μ-free fv formula. Hence occurrences of μ 
in fv subformulas can be eliminated using Lemma 1 and we may assume that 
there are no such subformulas. Since nested occurrences of μ in μ-subformulas 
from $\Gamma$ can be eliminated by appropriate use of $\mu_3$, we may assume 
that there are no such occurrences. 

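Let $S = \{s_{\mu_i X_1 \ldots X_n.\varphi_1, \ldots, \varphi_n} : 1 \leq i \leq n < \omega$, 
$\mu_i X_1 \ldots X_n.\varphi_1, \ldots, \varphi_n$ is a formula from L$\}$ be 
a set of fresh 0-place flexible relation symbols. Let L(S) be the HDC language 
built using the non-logical symbols of L and the symbols from $S$. Every 
formula $\varphi$ from L can be represented in the form 
$[\psi_1/X_1, \ldots, \psi_n/X_n]\psi$ where $\psi$ does not contain μ and 
contains $X_1, \ldots, X_n$, and $\psi_i$, $i = 1, \ldots, n$, are distinct 
μ-formulas. This representation is unique. Given this representation of 
$\varphi$, we denote the formula 
$[s_{\psi_1}/X_1, \ldots, s_{\psi_n}/X_n]\psi$ from L(S) by $t(\varphi)$. Note 
that the translation $t$ is invertible and its converse $t^{-1}$ is defined on 
the whole L(S). 

Let $\Delta = \{\Box\alpha : \alpha$ is an instance of $\mu_1$ or $\mu_2$ in 
L$\}$. Then the set 
$\Gamma' = \{t(\varphi) : \varphi \in \Gamma \cup \Delta\}$ is consistent with 
respect to $\vdash_{HDC}$. Assume the contrary. Then there exists a proof of 
$\bot$ with its premisses in $\Gamma'$ in $\vdash_{HDC}$. Replacing each 
formula $\psi$ in this proof by $t^{-1}(\psi)$ gives a proof of $\bot$ from 
$\Gamma$ in $\vdash_{\mu HDC}$, which contradicts the consistency of $\Gamma$. 

Hence there exists a model $M$ for L(S) and an interval 
$\sigma \in \mathbf{I}(T_M)$ such that $M, \sigma \models \Gamma'$. 

Now let us prove that 
$M, \sigma \models \Box(\varphi \Leftrightarrow s_\varphi)$ for every closed 
simple formula $\varphi$ from L. Let $\varphi$ be 
$\mu_i X_1 \ldots X_n.\psi_1, \ldots, \psi_n$. Let 
$\varphi_k \rightleftharpoons \mu_k X_1 \ldots X_n.\psi_1, \ldots, \psi_n$, 
$k = 1, \ldots, n$, for short. Then $M$ satisfies the t-translations 

$$\Box(s_{\varphi_k} \Leftrightarrow [s_{\varphi_1}/X_1, \ldots, s_{\varphi_n}/X_n]\psi_k)$$ 

$$\bigwedge_{j=1}^{n} \Box([t(\theta_1)/X_1, \ldots, t(\theta_n)/X_n]\psi_j \Rightarrow t(\theta_j)) \Rightarrow \Box(s_{\varphi_k} \Rightarrow t(\theta_k))$$ 

of the instances of $\mu_1$ and $\mu_2$ for all n-tuples of formulas 
$\theta_1, \ldots, \theta_n$ from L. The first of these instances implies that 
$(s_{\varphi_1}, \ldots, s_{\varphi_n})$ evaluates to a fixed point of the 
operator represented by $(\psi_1, \ldots, \psi_n)$. Consider the instance of 
$\mu_2$. Let $\theta_k$ be a μ-free formula from L such that 
$M, \sigma \models \Box(\theta_k \Leftrightarrow \varphi_k)$ for 
$k = 1, \ldots, n$. Such formulas exist by Proposition 2. Then $t(\theta_k)$ 
is $\theta_k$ and the above instance of $\mu_2$ is actually 

$$\bigwedge_{j=1}^{n} \Box([\theta_1/X_1, \ldots, \theta_n/X_n]\psi_j \Rightarrow \theta_j) \Rightarrow \Box(s_{\varphi_k} \Rightarrow \theta_k)$$ 

Besides, 
$M, \sigma \models \Box([\theta_1/X_1, \ldots, \theta_n/X_n]\psi_j \Rightarrow \theta_j)$, 
$j = 1, \ldots, n$, by the choice of $\theta_k$. Hence 
$M, \sigma \models \Box(s_{\varphi_k} \Rightarrow \theta_k)$. This means that 
$(s_{\varphi_1}, \ldots, s_{\varphi_n})$ evaluates to the least fixed point of 
the operator represented by $(\psi_1, \ldots, \psi_n)$. Hence 
$M, \sigma \models \Box(s_\varphi \Leftrightarrow \varphi)$ for every 
μ-formula $\varphi$ with no nested occurrences of μ. This entails that 
$M, \sigma \models \Box(\varphi \Leftrightarrow t(\varphi))$ for every 
$\varphi \in \Gamma$. Hence, $M, \sigma \models \Gamma$. 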
Abstract. In this paper we present a complete proof system for timed 
automata. It extends our previous axiomatisation of timed bisimulation 
for the class of loop-free timed automata with unique fixpoint induction. 
To our knowledge, this is the first algebraic theory for the whole class 
of timed automata with a completeness result, and it thus fills a gap in the 
theory of timed automata. The proof of the completeness result relies 
on the notion of symbolic timed bisimulation, adapted from the work on 
value-passing processes. 



1 Introduction 

The last decade has seen a growing interest in extending various concurrency 
theories with timing constructs so that real-time aspects of concurrent systems 
can be modeled and analysed. Among them, timed automata [AD94] have stood 
out as a fundamental model for real-time systems. 

A timed automaton is a finite automaton extended with a finite set of 
real-valued clock variables. A node of a timed automaton is associated with an 
invariant constraint on the clock variables, while an edge is decorated with a 
clock constraint, an action label, and a subset of clocks to be reset after 
the transition. At each node a timed automaton may perform two kinds of 
transitions: it may let time pass for any amount (a delay transition), as long 
as the invariant is satisfied, or choose an edge whose constraint is met, make 
the move, reset the relevant clocks to zero, and arrive at the target node (an 
action transition). Two timed automata are timed bisimilar if they can match 
each other's action transitions as well as delay transitions, and their 
residuals remain timed bisimilar. By now most theoretical aspects of timed 
automata have been well studied, but they still lack a satisfactory algebraic 
theory. 

In this paper we shall develop a complete axiomatisation for timed automata, 
in the form of an inference system, in which the equalities between pairs of 
timed automata that are timed bisimilar can be derived. To this end we first 
propose a language, in CCS style, equipping it with a symbolic transitional 
semantics in such a way that each term in the language denotes a timed 
automaton. The language has a conditional construct $\phi \to t$, read "if 
$\phi$ then $t$", and an action prefixing $a(\mathbf{x}).t$, meaning "perform 
the action $a$, reset the clocks in $\mathbf{x}$ to zero, then behave like 
$t$". The proof system consists of a set of inference rules and the standard 
monoid laws for bisimulation. Roughly speaking, the monoid laws characterise 
bisimulation, while the inference rules deal with specific constructs in the 
language. The judgments of the inference system are of the form 

$$\phi \rhd t = u$$ 

where $\phi$ is a time constraint and $t$, $u$ are terms. Intuitively it 
means: $t$ and $u$ are timed bisimilar over clock evaluations satisfying 
$\phi$. A proof system of this nature already appeared in our previous work on 
axiomatising timed automata, [LW00.1], with a serious limitation: it is 
complete only over the recursion-free subset of the language, i.e. the subset 
of timed automata without loops. A standard way of extending such an 
axiomatisation to deal with recursion is to add the following unique fixpoint 
induction rule [Mil84]: 

$$\text{UFI}\quad \frac{t = u[t/X]}{t = \mathbf{fix}Xu}\quad X \text{ guarded in } u$$ 

* Supported by a grant from National Science Foundation of China. 



However, this rule is incompatible with the inference system presented in 
[LW00.1]: the key inference rule handling action prefixing and clock resetting 
takes the form 

$$\text{ACTION-DJ}\quad \frac{\phi{\downarrow_{\mathbf{xy}}}{\Uparrow} \rhd t = u}{\phi \rhd a(\mathbf{x}).t = a(\mathbf{y}).u}\quad \mathbf{x} \cap C(u) = \mathbf{y} \cap C(t) = \emptyset$$ 

where $\phi{\downarrow_{\mathbf{xy}}}{\Uparrow}$ is a clock constraint 
obtained from $\phi$ by first setting the clocks in $\mathbf{xy}$ to zero 
(operator $\downarrow_{\mathbf{xy}}$), then removing upper bounds on all 
clocks of $\phi$ (operator $\Uparrow$); $C(t)$ and $C(u)$ are the sets of 
clocks appearing in $t$ and $u$, respectively. The side condition is needed to 
ensure that the clocks of one process do not get reset by the other. Because 
of this the inference system of [LW00.1] is mainly used for reasoning between 
terms with disjoint sets of clocks. It is difficult to fit the UFI rule into 
such contexts: it does not make sense to perform the substitution $u[t/X]$ 
when the clock set of $u$ is different from that of $t$. To overcome this 
difficulty, we replace ACTION-DJ with two rules: 



$$\text{ACTION}\quad \frac{\phi{\downarrow_{\mathbf{x}}}{\Uparrow} \rhd t = u}{\phi \rhd a(\mathbf{x}).t = a(\mathbf{x}).u}$$ 

and 

$$\text{THINNING}\quad \frac{}{a(\mathbf{xy}).t = a(\mathbf{x}).t}\quad \mathbf{y} \cap C(t) = \emptyset$$ 

The first rule does not require a side condition. However, the subset of 
clocks reset by the action prefix on both sides of the equation must be the 
same. This guarantees soundness. On the other hand, "redundant" clocks can be 
removed by THINNING. It is not difficult to see that ACTION-DJ is derivable 
from the above two rules. 

The completeness proof relies on the introduction of the notion of symbolic 
timed bisimulation, $t \sim^{\phi} u$, which captures timed bisimulation in 
the following sense: $t \sim^{\phi} u$ if and only if $t\rho$ and $u\rho$ are 
timed bisimilar for any clock evaluation $\rho$ satisfying $\phi$. Following 
[Mil84], to show that the inference system is complete, that is 
$t \sim^{\phi} u$ implies $\vdash \phi \rhd t = u$, we first transform $t$ and 
$u$ into standard equation sets which are the syntactical representations of 
timed automata. We then construct a new equation set out of the two and prove 
that $t$ and $u$ both satisfy the product equation set, by exploiting the 
assumption that $t$ and $u$ are symbolically timed bisimilar. Finally we show 
that, with the help of UFI, if two terms satisfy the same set of standard 
equations then they are provably equal. 

The result of this paper fills a gap in the theory of timed automata. It demon- 
strates that bisimulation equivalences of timed automata are as mathematically 
tractable as those of standard process algebras. 

The rest of the paper is organised as follows: In the next section we first 
recall the definition of timed automata, then present a language to describe 
them. Section 3 introduces symbolic timed bisimulation. The inference system 
is put forward in Section 4. Section 5 is devoted to proving the completeness of 
the proof system. The paper concludes with Section 6 where related work is also 
briefly discussed. 

Due to space limitation, many details and proofs have been omitted. They 
can be found in the full version of the paper [LW00.2]. 

2 A Language for Timed Automata 

We assume a finite set $A$ of synchronization actions and a finite set $C$ of
real-valued clock variables. We use a, b etc. to range over $A$ and x, y etc. to range
over $C$. We use $\mathcal{B}(C)$, ranged over by $\phi$, $\psi$ etc., to denote the set of conjunctive
formulas of atomic constraints of the form $x_i \bowtie m$ or $x_i - x_j \bowtie n$, where
$x_i, x_j \in C$, $\bowtie\ \in \{\le, <, \ge, >\}$ and m, n are natural numbers. The elements of
$\mathcal{B}(C)$ are called clock constraints.

Definition 2.1. A timed automaton over actions $A$ and clocks $C$ is a tuple
$(N, l_0, E)$ where

- $N$ is a finite set of nodes,

- $l_0 \in N$ is the initial node,

- $E \subseteq N \times \mathcal{B}(C) \times A \times 2^{C} \times N$ is the set of edges.

When $(l, g, a, r, l') \in E$, we write $l \xrightarrow{g,a,r} l'$.
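
As an illustration only, the tuple of Definition 2.1 can be written down as a small data structure. The sketch below is in Python; the class names, the encoding of guards as tuples of atomic constraints, and the helper `outgoing` are assumptions made for this example and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    guard: tuple          # conjunction of atomic constraints (hypothetical encoding)
    action: str
    resets: frozenset     # clocks reset to zero when the edge is taken
    target: str


@dataclass(frozen=True)
class TimedAutomaton:
    nodes: frozenset
    initial: str
    edges: frozenset

    def __post_init__(self):
        # The initial node and every edge endpoint must be nodes of the automaton.
        if self.initial not in self.nodes:
            raise ValueError("initial node is not in the node set")
        for e in self.edges:
            if e.source not in self.nodes or e.target not in self.nodes:
                raise ValueError("edge endpoint is not in the node set")

    def outgoing(self, node):
        """Edges leaving `node`, in a deterministic order."""
        return sorted(
            (e for e in self.edges if e.source == node),
            key=lambda e: (e.action, e.target),
        )


# A one-node automaton with a self-loop guarded by x >= 1 that resets x.
loop = Edge("l0", (("x", ">=", 1),), "a", frozenset({"x"}), "l0")
automaton = TimedAutomaton(frozenset({"l0"}), "l0", frozenset({loop}))
```

Frozen dataclasses are used so that edges can be stored in a set, mirroring the set $E$ of the definition.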

We shall present the operational semantics for timed automata in terms of a 
process algebraic language in which each term denotes an automaton. 
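DELAY: $t\rho \xrightarrow{d} t(\rho + d)$, provided $\rho + d \models Inv(t)$

CHOICE: $\dfrac{t\rho \xrightarrow{a} t'\rho'}{(t + u)\rho \xrightarrow{a} t'\rho'}$

ACTION: $(a(x).t)\rho \xrightarrow{a} t\rho\{x := 0\}$

REC: $\dfrac{(t[\mathbf{fix}Xt/X])\rho \xrightarrow{a} t'\rho'}{(\mathbf{fix}Xt)\rho \xrightarrow{a} t'\rho'}$

GUARD: $\dfrac{t\rho \xrightarrow{a} t'\rho'}{(\phi \to t)\rho \xrightarrow{a} t'\rho'}$, provided $\rho \models \phi$

INV: $\dfrac{t\rho \xrightarrow{a} t'\rho'}{(\{\phi\}t)\rho \xrightarrow{a} t'\rho'}$, provided $\rho \models \phi$

Fig. 1. Standard Transitional Semantics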



We preassume a set of process variables, ranged over by X, Y, Z, .... The
language for timed automata over C can be given by the following BNF grammar:

s ::= $\{\phi\}t$

t ::= $0 \mid \phi \to t \mid a(x).s \mid t + t \mid X \mid \mathbf{fix}Xt$

0 is the inactive process which can do nothing, except for allowing time to pass.
$\phi \to t$, read “if $\phi$ then t”, is the usual (one-armed) conditional construct. $a(x).s$
is action prefixing. + is nondeterministic choice. The $\{\phi\}t$ construct introduces
an invariant.

A recursion $\mathbf{fix}Xt$ binds X in t. This is the only binding operator in this
language. It induces the notions of bound and free process variables as usual.
Terms not containing free process variables are closed. A recursion $\mathbf{fix}Xt$ is
guarded if every occurrence of X in t is within the scope of an action prefixing.

The set of clock variables used in a term t is denoted C (t) . 

A clock valuation is a function from $C$ to $\mathbb{R}^{\ge 0}$ (non-negative real numbers),
and we use $\rho$ to range over clock valuations. The notations $\rho\{x := 0\}$ and $\rho + d$
are defined thus

$\rho\{x := 0\}(y) = 0$ if $y \in x$, and $\rho(y)$ otherwise

$(\rho + d)(x) = \rho(x) + d$ for all $x$
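
The two operations on valuations can be sketched in a few lines of Python. Representing a valuation as a dictionary from clock names to numbers, and the function names `reset` and `delay`, are choices made for this illustration only.

```python
def reset(valuation, clocks):
    """rho{x := 0}: set every clock in `clocks` to zero, keep the others."""
    return {c: (0 if c in clocks else v) for c, v in valuation.items()}


def delay(valuation, d):
    """rho + d: let d >= 0 time units pass on every clock."""
    if d < 0:
        raise ValueError("delays are non-negative")
    return {c: v + d for c, v in valuation.items()}


rho = {"x": 1.5, "y": 0.25}
print(reset(rho, {"x"}))   # x becomes 0, y is unchanged
print(delay(rho, 2))       # both clocks advance by 2
```

Both functions return a new dictionary, so a valuation can be shared between several successor processes without being modified.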

To give a transitional semantics to our language, we first assign each term t
an invariant constraint $Inv(t)$ by letting

$Inv(t) = \phi$ if t has the form $\{\phi\}s$, and $Inv(t) = tt$ otherwise

We shall require all invariants to be downward-closed:

For all $d \in \mathbb{R}^{\ge 0}$, $\rho + d \models \phi$ implies $\rho \models \phi$

Given a clock valuation $\rho : C \to \mathbb{R}^{\ge 0}$, a term can be interpreted according
to the rules in Figure 1, where the symmetric rule for + has been omitted. The
transitional semantics uses two types of transition relations: action transition
$\xrightarrow{a}$ and delay transition $\xrightarrow{d}$. We call $t\rho$ a process, where t is a term and $\rho$ a
valuation; we use p, q, ... to range over the set of processes. We also write $\mu$ for
either an action or a delay (a real number).
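Action: $a(x).t \xrightarrow{tt,a,x} t$

Guard: $\dfrac{t \xrightarrow{\psi,a,x} t'}{\phi \to t \xrightarrow{\phi \wedge \psi,a,x} t'}$

Choice: $\dfrac{t \xrightarrow{\psi,a,x} t'}{t + u \xrightarrow{\psi,a,x} t'}$

Inv: $\dfrac{t \xrightarrow{\psi,a,x} t'}{\{\phi\}t \xrightarrow{\psi,a,x} t'}$

Rec: $\dfrac{t[\mathbf{fix}Xt/X] \xrightarrow{\psi,a,x} t'}{\mathbf{fix}Xt \xrightarrow{\psi,a,x} t'}$

Fig. 2. Symbolic Transitional Semantics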



Definition 2.2. A symmetric relation R over processes is a timed bisimulation
if $(p, q) \in R$ implies

whenever $p \xrightarrow{\mu} p'$ then $q \xrightarrow{\mu} q'$ for some $q'$ with $(p', q') \in R$.

We write $p \sim q$ if $(p, q) \in R$ for some timed bisimulation R.

The symbolic transitional semantics of this language is listed in Figure 2. 
Again the symmetric rule for + has been omitted. Note that invariants are simply 
forgotten in the symbolic transitional semantics. This reflects our intention that 
symbolic transitions correspond to edges in timed automata, while invariants 
reside in nodes. 

According to the symbolic semantics, each guarded closed term of the lan- 
guage gives rise to a timed automaton; on the other hand, it is not difficult to 
see that every timed automaton can be generated from a guarded closed term 
in the language. In the sequel we will use the phrases “timed automata” and 
“terms” interchangeably. 

3 Symbolic Timed Bisimulation 

In this section we shall define a symbolic version of timed bisimulation. To
simplify the presentation we fix two timed automata. To avoid clock variables of one
automaton being reset by the other, we assume the sets of clocks of the two timed
automata under consideration are disjoint, and write $C$ for the union of the two
clock sets.$^1$ Let $N$ be the largest natural number occurring in the constraints
of the two automata. An atomic constraint over $C$ with ceiling $N$ has one of the
three forms: $x > N$, $x \bowtie m$ or $x - y \bowtie n$, where $x, y \in C$, $\bowtie\ \in \{\le, <, \ge, >\}$ and
$m, n \le N$ are natural numbers.

In the following, “atomic constraint” always means “atomic constraint over $C$
with ceiling $N$”. Note that given two timed automata there are only finitely many
such atomic constraints. We shall use c to range over atomic constraints.

A constraint, or zone, is a boolean combination of atomic constraints. A
constraint $\phi$ is consistent if there is some $\rho$ such that $\rho \models \phi$. Let $\phi$ and $\psi$ be two
constraints. We write $\phi \models \psi$ to mean that $\rho \models \phi$ implies $\rho \models \psi$ for any $\rho$. Note that
the relation $\models$ is decidable.

$^1$ This does not put any restriction on our results, because we can always rename clock
variables of an automaton without affecting its behaviour.
A region constraint, or region for short, over n clock variables $x_1, \ldots, x_n$ is
a consistent constraint containing the following atomic conjuncts:

- For each $i \in \{1, \ldots, n\}$ either $x_i = m_i$ or $m_i < x_i < m_i + 1$ or $x_i > N$;

- For each pair $i, j \in \{1, \ldots, n\}$, $i \neq j$, such that both $x_i$ and $x_j$ are
not greater than $N$, either $x_i - m_i = x_j - m_j$ or $x_i - m_i < x_j - m_j$ or
$x_j - m_j < x_i - m_i$.

where the $m_i$ in $x_i - m_i$ of the second clause refers to the $m_i$ related to $x_i$ in
the first clause. In words, $m_i$ is the integral part of $x_i$ and $x_i - m_i$ its fractional
part.

Given a finite set of clock variables $C$ and a ceiling $N$, the set of region
constraints over $C$ is finite and is denoted $\mathcal{RC}^{C}_{N}$. In the sequel we will omit the
sub- and super-scripts when they can be supplied by the context.
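
To make the two clauses concrete, the following Python sketch computes, for a given valuation and ceiling, a description of the region that contains it: the integral information for each clock, and the ordering of fractional parts for each pair of clocks not exceeding the ceiling. The function name, the output encoding, and the use of exact fractions are assumptions of this example, not part of the paper.

```python
from fractions import Fraction
from math import floor


def region_signature(valuation, ceiling):
    """Describe the region containing `valuation` for the given ceiling N."""
    integral = []   # first clause: x = m, m < x < m + 1, or x > N
    bounded = []    # clocks not greater than N, with their fractional parts
    for clock in sorted(valuation):
        value = valuation[clock]
        if value > ceiling:
            integral.append((clock, ">N"))
        else:
            m = floor(value)
            integral.append(
                (clock, "=%d" % m if value == m else "(%d,%d)" % (m, m + 1))
            )
            bounded.append((clock, value - m))
    order = []      # second clause: compare fractional parts pairwise
    for i, (x, fx) in enumerate(bounded):
        for y, fy in bounded[i + 1:]:
            order.append((x, y, "=" if fx == fy else "<" if fx < fy else ">"))
    return tuple(integral), tuple(order)


a = region_signature({"x": Fraction(1, 2), "y": Fraction(3, 2)}, 2)
b = region_signature({"x": Fraction(1, 4), "y": Fraction(5, 4)}, 2)
c = region_signature({"x": Fraction(1, 4), "y": Fraction(3, 2)}, 2)
print(a == b, a == c)
```

Two valuations lie in the same region exactly when their signatures coincide; in the example the first two valuations agree on integral parts and on the ordering of fractional parts, while the third does not.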

Fact 1 Suppose that $\phi$ is a region constraint and $\psi$ a zone. Then either $\phi \Rightarrow \psi$
or $\phi \Rightarrow \neg\psi$.

So a region is either entirely contained in a zone, or is completely outside a zone.

A canonical constraint is a disjunction of regions. Given a constraint we can
first transform it into disjunctive normal form, then decompose each disjunct
into a disjoint set of regions. Both steps can be effectively implemented. As a
corollary to Fact 1, if we write $\mathcal{RC}(\phi)$ for the set of regions contained in the zone
$\phi$, then $\bigvee \mathcal{RC}(\phi) = \phi$, i.e. $\bigvee \mathcal{RC}(\phi)$ is the canonical form of $\phi$.

We will need two (postfixing) operators $\downarrow_x$ and $\Uparrow$ to deal with resetting
and time passage. Here we only define them semantically. The syntactical and
effective definitions can be found in the full version of the paper.

For any $\phi$, let $\phi\downarrow_x = \{\rho\{x := 0\} \mid \rho \models \phi\}$ and $\phi\Uparrow = \{\rho + d \mid \rho \models \phi,\ d \in \mathbb{R}^{\ge 0}\}$.
Call a constraint $\phi$ $\Uparrow$-closed if $\phi\Uparrow = \phi$. It is easy to see that $\phi\Uparrow$ is $\Uparrow$-closed,
and if $\phi$ is a region constraint then so is $\phi\downarrow_x$.

Symbolic bisimulation will be defined as a family of binary relations indexed
by clock constraints. Following [Ger92] we use constraints over the union of the
(disjoint) clock sets of two timed automata as indices. Given a constraint $\phi$,
a finite set of constraints $\Phi$ is called a $\phi$-partition if $\bigvee \Phi = \phi$. A $\phi$-partition
$\Phi$ is called finer than another such partition $\Psi$ if $\Phi$ can be obtained from $\Psi$
by decomposing some of its elements. By the corollary to Fact 1, $\mathcal{RC}(\phi)$ is a
$\phi$-partition, and is the finest such partition. In particular, if $\phi$ is a region constraint
then $\{\phi\}$ is the only partition of $\phi$.



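Definition 3.1. A constraint-indexed family of symmetric relations over terms
$S = \{S^{\phi} \mid \phi \text{ is } \Uparrow\text{-closed}\}$ is a symbolic timed bisimulation if $(t, u) \in S^{\phi}$ implies

1. $\phi \models Inv(t) \Leftrightarrow Inv(u)$, and

2. whenever $t \xrightarrow{\psi,a,x} t'$ then there is an $(Inv(t) \wedge \phi \wedge \psi)$-partition $\Phi$ such that for
each $\phi' \in \Phi$ there is $u \xrightarrow{\psi',a,y} u'$ for some $\psi'$, $y$ and $u'$ such that $\phi' \models \psi'$ and
$(t', u') \in S^{\phi'\downarrow_{xy}\Uparrow}$.

We write $t \sim^{\phi} u$ if $(t, u) \in S^{\phi}$ and $S^{\phi} \in S$ for some symbolic bisimulation S.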
S1  $X + 0 = X$                S2  $X + X = X$

S3  $X + Y = Y + X$            S4  $(X + Y) + Z = X + (Y + Z)$

Fig. 3. The Equational Axioms 
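
Read as rewrite laws, S1-S4 say that + is associative, commutative and idempotent with unit 0, so a sum can be identified with the set of its non-zero summands. The Python sketch below uses this observation to decide equality modulo S1-S4 for terms whose summands are treated as opaque atoms; the term encoding (strings for atoms, tuples `("+", t, u)` for choice) and the function names are assumptions made for this illustration, and the other rules of the proof system are not modelled.

```python
def summands(term):
    """Flatten a choice term into the set of its non-0 summands."""
    if term == "0":
        return frozenset()                       # S1: 0 is the unit of +
    if isinstance(term, tuple) and term[0] == "+":
        # S2-S4: duplicates, order and bracketing of summands are irrelevant
        return summands(term[1]) | summands(term[2])
    return frozenset([term])


def equal_modulo_s1_s4(t, u):
    """True if t = u follows from S1-S4 alone, summands being opaque atoms."""
    return summands(t) == summands(u)


left = ("+", "a.t", ("+", "0", "a.t"))
right = "a.t"
print(equal_modulo_s1_s4(left, right))
```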



Symbolic timed bisimulation captures $\sim$ in the following sense:

Theorem 3.2. For $\Uparrow$-closed $\phi$, $t \sim^{\phi} u$ iff $t\rho \sim u\rho$ for any $\rho \models \phi \wedge Inv(t) \wedge
Inv(u)$.

4 The Proof System 

The proposed proof system consists of a set of equational axioms in Figure 3 and
a set of inference rules in Figure 4 where the standard rules for equational
reasoning have been omitted. The judgments of the inference system are conditional
equations of the form

$\phi \rhd t = u$

where $\phi$ is a constraint and t, u terms. Its intended meaning is “$t \sim^{\phi} u$”, or
“$t\rho \sim u\rho$ for any $\rho \models \phi \wedge Inv(t) \wedge Inv(u)$”. $tt \rhd t = u$ will be abbreviated as
t = u.

The axioms are the standard monoid laws for bisimulation in process algebras.
More interesting are the inference rules. For each construct in the language
there is a corresponding introduction rule. CHOICE expresses the fact that timed
bisimulation is preserved by +. The rule GUARD permits a case analysis on
conditionals. The rule INV deals with invariants. It also does a case analysis and
appears very similar to GUARD. However, there is a crucial difference: when
the guard $\psi$ is false, $\psi \to t$ behaves like 0, the process which is inactive but can
allow time to pass; on the other hand, when the invariant $\psi$ is false, $\{\psi\}t$ behaves
like $\{ff\}0$, the process usually referred to as time-stop, which is not only inactive
but also “still”: it cannot even let time elapse. ACTION is the introduction
rule for action prefixing (with clock resetting). The THINNING rule allows
redundant clocks to be introduced or removed. REC is the usual rule for folding/unfolding
recursions, while UFI says that if X is guarded in u then $\mathbf{fix}Xu$ is the unique solution
of the equation X = u. UNG can be used to remove unguarded recursion. Finally,
the two rules PARTITION and ABSURD do not handle any specific constructs
in the language. They are so-called “structural rules” used to “glue” pieces of
derivation together.

Let us write $\vdash \phi \rhd t = u$ to mean that $\phi \rhd t = u$ can be derived from this proof
system.

Some useful properties of the proof system are summarised in the following 
proposition: 

Proposition 4.1. 1. $\vdash \phi \to (\psi \to t) = \phi \wedge \psi \to t$

2. $\vdash t = t + \phi \to t$

3. If $\phi \models \psi$ then $\vdash \phi \rhd t = \psi \to t$
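GUARD: $\dfrac{\phi \wedge \psi \rhd t = u \qquad \phi \wedge \neg\psi \rhd 0 = u}{\phi \rhd \psi \to t = u}$

INV: $\dfrac{\phi \wedge \psi \rhd t = u \qquad \phi \wedge \neg\psi \rhd \{ff\}0 = u}{\phi \rhd \{\psi\}t = u}$

THINNING: $a(xy).t = a(x).t$, provided $y \cap C(t) = \emptyset$

CHOICE: $\dfrac{\phi \rhd t = t'}{\phi \rhd t + u = t' + u}$

ACTION: $\dfrac{\phi\downarrow_{x}\Uparrow \rhd t = u}{\phi \rhd a(x).t = a(x).u}$

REC: $\mathbf{fix}Xt = t[\mathbf{fix}Xt/X]$

UFI: $\dfrac{t = u[t/X]}{t = \mathbf{fix}Xu}$, provided X is guarded in u

UNG: $\mathbf{fix}X(X + t) = \mathbf{fix}Xt$

PARTITION: $\dfrac{\phi_1 \rhd t = u \qquad \phi_2 \rhd t = u}{\phi \rhd t = u}$, provided $\phi \models \phi_1 \vee \phi_2$

ABSURD: $ff \rhd t = u$

Fig. 4. The Inference Rules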



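4. $\vdash \phi \wedge \psi \rhd t = u$ implies $\vdash \phi \rhd \psi \to t = \psi \to u$

5. $\vdash \phi \to (t + u) = \phi \to t + \phi \to u$

6. $\vdash \phi \to t + \psi \to t = \phi \vee \psi \to t$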

The following lemma shows how to “push” a condition through an action
prefix. It can be proved using ACTION, INV and Proposition 4.1.5.

Lemma 4.2. $\vdash \phi \rhd a(x).\{\psi\}t = a(x).\{\psi\}(\phi\downarrow_{x}\Uparrow \to t)$.

The UFI rule, as presented in Figure 4, is unconditional. However, a
conditional version can be easily derived from Proposition 4.1.4, REC and UFI:

Proposition 4.3. Suppose X is guarded in u. Then $\vdash \phi \rhd t = u[\phi \to t/X]$ implies
$\vdash \phi \rhd t = \mathbf{fix}X(\phi \to u)$.

The rule PARTITION has a more general form:

Proposition 4.4. Suppose $\Phi$ is a $\phi$-partition and $\vdash \psi \rhd t = u$ for each $\psi \in \Phi$,
then $\vdash \phi \rhd t = u$.

Soundness of the proof system is stated below:

Theorem 4.5. If $\vdash \phi \rhd t = u$ and $\phi$ is $\Uparrow$-closed then $t\rho \sim u\rho$ for any $\rho \models
\phi \wedge Inv(t) \wedge Inv(u)$.

The standard approach to the soundness proof is by induction on the length
of derivations, performing a case analysis on the last rule or axiom used.
However, this does not quite work here. The reason is that the definition of timed
bisimulation requires two processes to simulate each other after any time delays.
To reflect this in the proof system, we apply the $\Uparrow$ operator, after $\downarrow$ for clock
resetting, in the premise of the ACTION rule. But not all the inference rules
preserve the $\Uparrow$-closedness property. An example is GUARD. In order to derive
$\phi \rhd \psi \to t = u$, we need to establish $\phi \wedge \psi \rhd t = u$ and $\phi \wedge \neg\psi \rhd 0 = u$. Even if $\phi$
is $\Uparrow$-closed, $\phi \wedge \psi$ may not be so.

To overcome this difficulty, we need a notion of “timed bisimulation up to a 
time bound”, formulated as follows: 
Definition 4.6. Two processes p and q are timed bisimilar up to $d_0 \in \mathbb{R}^{\ge 0}$,
written $p \sim_{d_0} q$, if for any d such that $0 \le d \le d_0$

- whenever $p \xrightarrow{d} p'$ then $q \xrightarrow{d} q'$ for some $q'$ and $p' \simeq q'$

(and symmetrically for q), where $p \simeq q$ is defined thus

- whenever $p \xrightarrow{a} p'$ then $q \xrightarrow{a} q'$ for some $q'$ and $p' \sim q'$

(and symmetrically for q).

Note that ~ is the same as and in general. 

Now the following proposition, of which Theorem 4.5 is a special case when
$\phi$ is $\Uparrow$-closed, can be proved by standard induction on the length of derivations:

Proposition 4.7. If $\vdash \phi \rhd t = u$ then $t\rho \sim_{d_0} u\rho$ for any $\rho$ and $d_0$ such that
$\rho + d \models \phi \wedge Inv(t) \wedge Inv(u)$ for all $0 \le d \le d_0$.

5 Completeness 

This section is devoted to proving the completeness of the proof system, which is
stated thus: if $t \sim^{\phi} u$ then $\vdash \phi \rhd t = u$. The structure of the proof follows
that of [Mil84]. The intuition behind the proof is as follows: A timed automaton
is presented as a set of standard equations in which the left-hand side of each
equation is a formal process variable corresponding to a node of the automaton,
while the right-hand side encodes the outgoing edges from the node. We
first transform, within the proof system, both t and u into such equation sets
(Proposition 5.1). We then construct a “product” of the two equation sets,
representing the product of the two underlying timed automata. Because t and u
are timed bisimilar over $\phi$, each should also be bisimilar to the product over $\phi$.
Using this as a guide we show that such bisimilarity is derivable within the proof
system, i.e. both t and u provably satisfy the product equation set
(Proposition 5.2). Finally we demonstrate that a standard set of equations has only one
solution, therefore the required equality between t and u can be derived. The
unique fixpoint induction is only employed in the last step of the proof, namely
Proposition 5.3.

Let $X = \{X_i \mid i \in I\}$ and $W$ be two disjoint sets of process variables and $x$
a set of clock variables. Let also $u_i$, $i \in I$, be terms with free process variables
in $X \cup W$ and clock variables in $x$. Then

$E : \{X_i = u_i \mid i \in I\}$

is an equation set with formal process variables $X$ and free process variables in
$W$. $E$ is closed if $W = \emptyset$. $E$ is a standard equation set if each $u_i$ has the form

$\{\phi_i\}\bigl(\sum_{k \in K_i} \psi_{ik} \to a_{ik}(x_{ik}).X_{f(i,k)} \;+\; \sum_{k' \in K'_i} \psi_{ik'} \to W_{f'(i,k')}\bigr)$
A term t provably $\phi$-satisfies an equation set E if there exists a vector of terms
{ti \ i G I}, each ti being of the form and a vector of conditions { (j)i \ 

i & 1} such that (/)i = (/), h (/) > = t, |= 'ipl, and 

h (/)* > / x^\i e /] 

for each $i \in I$. We will simply say “t provably satisfies E” when $\phi_i = tt$ for all
$i \in I$.

Proposition 5.1. For any term t with free process variables W there exists a 
standard equation set E, with free process variables in W, which is provably 
satisfied by t. In particular, if t is closed then E is also closed. 

Proof. We first show that, by using UNG, for any term t there is a guarded term
$t'$ such that $FV(t) = FV(t')$ and $\vdash t = t'$. The proposition can then be proved
by structural induction on guarded terms.



Proposition 5.2. For guarded, closed terms t and u, if $t \sim^{\phi} u$ then there exists
a standard, closed equation set E which is provably $\phi$-satisfied by both t and u.

Proof. Let the sets of clock variables of t, u be $x$, $y$, respectively, with $x \cap y = \emptyset$.
Let also $E_1$ and $E_2$ be the standard equation sets for t and u, respectively:

E\ . { Alj — ^ ^ f{i.k) I ^ ^ } 

keKi 

E 2 : {Yj = {fjj} | j G J} 

l&Lj 

So there are U = {(j)'i}ti, uj = {tpj}u'j with \- ti = t, \- m = u such that 
1= (j)i (/)j, \= Ip i ip^, and 

b ^ ^ f^ik ^^iki^ik)'tf(i^k) b Uj = ^ ^ 

k&Ki iGLj 

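Without loss of generality, we may assume $a_{ik} = b_{jl} = a$ for all i, k, j, l.

For each pair of i, j, let

$\Phi_{ij} = \{\Delta \in \mathcal{RC}(xy) \mid t_i \sim^{\Delta} u_j\}$

Set $\phi_{ij} = \bigvee \Phi_{ij}$. By the definition of $\Phi_{ij}$, $\phi_{ij}$ is the weakest condition over
which $t_i$ and $u_j$ are symbolically bisimilar, that is, $\psi \Rightarrow \phi_{ij}$ for any $\psi$ such that
$t_i \sim^{\psi} u_j$. Also for each $\Delta \in \Phi_{ij}$, $\Delta \models Inv(t_i) \Leftrightarrow Inv(u_j)$, i.e., $\Delta \models \phi'_i \Leftrightarrow \psi'_j$,
hence $\Delta \models \phi_i \Leftrightarrow \psi_j$.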

For each A G Fij let Ifj = { {k,l) \ tf^i^k) Ug(jj) }. Define 

E : Zij = {(pi\{ ^ ^ A > ^ ^ ^i^ikY jl)-Y f{i^k)g(j,l)) 
We claim that E is provably $\phi$-satisfied by t when each $Z_{ij}$ is instantiated
with $t_i$ over $\phi_{ij}$. We need to show that each $t_i$ satisfies the equation for $Z_{ij}$ over $\phi_{ij}$.

Since the elements of $\Phi_{ij}$ are mutually disjoint, by Propositions 4.4 and 4.1, it
is sufficient to show that, for each $\Delta \in \Phi_{ij}$,

[Displayed goal and derivation chain: the formulas are not recoverable from the source text. The surviving annotations show that the successive steps of the chain are justified, in order, by Lemma 4.2, Prop. 4.1, Lemma 4.2, THINNING and S1-S4.]

Symmetrically we can show E is provably $\phi$-satisfied by u when $Z_{ij}$ is
instantiated with $u_j$ over $\phi_{ij}$.

Proposition 5.3. If both t and u provably $\phi$-satisfy a standard equation set E
then $\vdash \phi \rhd t = u$.

Proof. By induction on the size of E.

Combining Propositions 5.1, 5.2 and 5.3 we obtain the main theorem:

Theorem 5.4. If $t \sim^{\phi} u$ then $\vdash \phi \rhd t = u$.

6 Conclusion and Related Work 

We have presented an axiomatisation, in the form of an inference system, of
timed bisimulation for timed automata, and proved its completeness. To the
best of our knowledge, this is the first complete axiomatisation for the full set
of timed automata. As already mentioned in the introduction, the precursor to
this work is [LW00.1], in which an inference system complete over the loop-free
subset of timed automata is formulated. The key ingredient of the current
extension is unique fixpoint induction. Although the form of this rule is syntactically
the same as that used for parameterless processes [Mil84], here it is implicitly
parameterised on clock variables, in the sense that the rule deals with terms
involving clock variables which do not appear explicitly.

The most interesting developments so far in algebraic characterizations of
timed automata are presented in [ACM97, BP99]. As the main result, they
established that each timed automaton is equivalent to an algebraic expression
built out of the standard operators in formal languages, such as union,
intersection, concatenation and variants of Kleene’s star operator, in the sense that the
automaton recognizes the same timed language as denoted by the expression.
However, the issue of axiomatisation was not considered there. In [DAB96] a
set of equational axioms was proposed for timed automata, but no completeness
result was reported. [HS98] presents an algebraic framework for real-time
systems which is similar to timed automata where “invariants” are replaced by
“deadlines” (to express “urgency”), together with some equational laws. Apart
from these, we are not aware of any other published work on axiomatising timed
automata. On the other hand, most timed extensions of process algebras came
with equational axiomatisations. Of particular relevance are [Bor96] and [AJ94].
The former developed a symbolic theory for a timed process algebra, while the
latter used unique fixpoint induction to achieve a complete axiomatisation
for the regular subset of the timed-CCS proposed in [Wan91].
Abstract. In this paper, we investigate a text sparsification technique
based on the identification of local maxima. In particular, we first show
that looking for an order of the alphabet symbols that minimizes the
number of local maxima in a given string is an NP-hard problem. We
then describe how the local maxima sparsification technique can
be used to filter the access to unstructured texts. Finally, we show
experimentally that this approach can be successfully used to create
a space efficient index for searching a DNA sequence as quickly as a full
index.



1 Introduction 

Run Length Encoding (in short, RLE) is a well-known lossless compression
technique [7] based on the following simple idea: a sequence of k equal characters
(also called a run) can be encoded by a pair whose first component is the character
and whose second component is k. RLE turns out to be extremely efficient in
some special cases: for example, it can reach an 8-to-1 compression factor in the
case of scanned text. Moreover, it is also used in the JPEG image compression
standard [10]. We can view RLE as a (lossless) text sparsification technique.
Indeed, let us imagine that the position of each character of a text is an access
point to the text itself from which it is then possible to navigate either to the
left or to the right. The RLE technique basically sparsifies these access points
by selecting only the first position of each run.
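
The view of RLE as a sparsification of access points can be sketched as follows. This is a minimal Python illustration; the function names and the use of 0-based positions are assumptions of the example rather than anything fixed by the text.

```python
from itertools import groupby


def run_length_encode(text):
    """Encode each run of equal characters as a (character, length) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def run_access_points(text):
    """0-based positions of the first character of every run."""
    points, position = [], 0
    for _, length in run_length_encode(text):
        points.append(position)
        position += length
    return points


def run_length_decode(pairs):
    """Rebuild the original text, showing that the encoding is lossless."""
    return "".join(ch * length for ch, length in pairs)


text = "aaabccdd"
print(run_length_encode(text))
print(run_access_points(text))
```

Decoding the pairs gives back the original text, which is the sense in which this particular sparsification is lossless.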

Another well known form of text sparsification applies to structured texts,
that is, texts in which the notion of word is precisely identifiable (for example,
by means of delimiter symbols). In this case, a natural way of defining the access
points consists of selecting the position of the first character of each word (for
example, each character following a delimiter symbol). Unlike RLE, this
technique does not allow us to obtain a lossless compression of the original text
but it can be very useful for creating a space efficient index for searching
the text itself. For example, if we are looking for a given word X, this technique
allows us (by means of additional data structures such as suffix arrays) to analyze
only the words starting with the first character of X. The main drawback of this
approach is that it does not generalize to unstructured texts, that is, texts for
which there is no clear notion of word available (for example, DNA sequences).

The main goal of this paper is to study an alternative text sparsification
technique, which might be used to create space efficient indices for searching
unstructured texts. In particular, we will consider the following technique. Given
a string $X$ over a finite alphabet $\Sigma$, let us assume that a total order of the
symbols in $\Sigma$ is specified. We then say that an occurrence of a symbol $x$ in $X$ is
an access point to $X$ if $x$ is a local maximum; that is, both symbols adjacent to
$x$ are smaller than $x$. As in the case of the previous technique based on words,
this sparsification technique is lossy. Indeed, assume that $x$ and $y$ are two local
maxima: from this information, we can deduce that the symbol immediately
after $x$ is smaller than $x$ and the symbol immediately before $y$ is smaller than
$y$. Of course, in general this information is not sufficient to uniquely identify the
sequence of symbols between $x$ and $y$.

Nevertheless, the notion of local maxima has been proven very useful in string 
matching [1, 4, 8, 9] and dynamic data structures [6, 9] as an extension of the 
deterministic coin tossing technique [3] . It is well understood in terms of local 
similarities, by which independent strings that share equal portions have equal 
local maxima in those portions. In this paper we will consider the following two 
questions. 

- How much can a given text be sparsified by applying the local maxima
technique? Note that this question is different from the ones previously studied
in the literature on local maxima, as we would like to minimize the
number of local maxima while previous results aimed at minimizing the distance
between consecutive maxima.

In order to answer this question, we will introduce the following combinatorial
problem: given a text of length n over an alphabet $\Sigma$, find an order of $\Sigma$ which
minimizes the number of local maxima (that is, the number of access points).
We will then prove that this problem is NP-hard (see Sect. 2) for non-constant
sized alphabets (clearly, the problem can be solved in time $O(|\Sigma|!\,n)$ which, for
constant sized alphabets, is $O(n)$).

- Can the local maxima sparsification technique be used to filter the access to
unstructured texts in practice?

In order to answer this question we first describe how the technique can be
used to create an index for searching a given text (see Sect. 3). We will then give
a positive answer to the above question by experiments, in the case of texts that
are DNA sequences (see Sect. 4). In particular, we show that each run of the
sparsification algorithm reduces the number of maxima by a factor of three. We
exploit this to create a space efficient index for searching the sequence as quickly
as a full index by means of additional data structures such as suffix arrays.

2 The NP-Hardness Result 

In this section, we first describe the combinatorial problem associated with the
local maxima sparsification technique and we then show that this problem is
NP-hard.



2.1 The Combinatorial Problem 

Let $X = x_1 \cdots x_n$ be a string over a finite alphabet $\Sigma$ and assume that an order
$\pi$ of the symbols in $\Sigma$ (that is, a one-to-one function $\pi$ from $\Sigma$ to $\{1, \ldots, |\Sigma|\}$)
has been fixed. The local maxima measure $\mathcal{M}(X, \pi)$ of $X$ with respect to $\pi$ is
then defined as the number of local maxima that appear in $X$, that is,

$\mathcal{M}(X, \pi) = |\{i : (1 < i < n) \wedge (\pi(x_{i-1}) \le \pi(x_i)) \wedge (\pi(x_i) > \pi(x_{i+1}))\}|.$

The Minimum Local Maxima Number decision problem is then defined as
follows:

Instance A string $X$ over a finite alphabet $\Sigma$ and an integer value $K$.
Question Does there exist an order $\pi$ of $\Sigma$ such that $\mathcal{M}(X, \pi) \le K$?

Clearly, Minimum Local Maxima Number belongs to NP (since we just have
to non-deterministically try all possible orders of the alphabet symbols).
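
For small alphabets the problem can be explored by exhaustive search over all orders. The Python sketch below is only an illustration: the function names are invented for this example, and it assumes that the left-hand comparison in the definition of the measure is non-strict (a symbol repeated after itself can still be a maximum), which is the reading under which the gadget counts of Lemma 1 come out as stated.

```python
from itertools import permutations


def local_maxima_measure(text, rank):
    """Count interior positions i with rank[x[i-1]] <= rank[x[i]] > rank[x[i+1]]."""
    return sum(
        1
        for i in range(1, len(text) - 1)
        if rank[text[i - 1]] <= rank[text[i]] > rank[text[i + 1]]
    )


def best_order(text):
    """Try every order of the alphabet; return (minimum measure, one optimal order)."""
    alphabet = sorted(set(text))
    best = None
    for order in permutations(alphabet):
        # order[0] is the smallest symbol, order[-1] the largest
        rank = {symbol: r for r, symbol in enumerate(order, start=1)}
        count = local_maxima_measure(text, rank)
        if best is None or count < best[0]:
            best = (count, order)
    return best


def has_order_with_at_most(text, k):
    """Decision version: is there an order with at most k local maxima?"""
    return best_order(text)[0] <= k


print(best_order("abcbcba"))
```

The search visits every permutation of the alphabet and scans the text once per permutation, so it is practical only when the alphabet is very small.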



2.2 The Reduction 

We now define a polynomial-time reduction from the Maximum Exactly-Two
Satisfiability decision problem to Minimum Local Maxima Number.
Recall that Maximum Exactly-Two Satisfiability is defined as follows: given
a set of clauses with exactly two literals per clause and given an integer H, does
there exist a truth-assignment that satisfies at least H clauses? It is well-known
that Maximum Exactly-Two Satisfiability is NP-complete [5].

The basic idea of the reduction is to associate two symbols with each variable 
and one symbol with each clause and to force each pair of variable-symbols 
to be either smaller or greater than all clause-symbols. The variables whose 
both corresponding symbols are greater (respectively, smaller) than the clause- 
symbols will be assigned the true (respectively, false) value. The implementation 
of this basic idea will require several additional technicalities which are described 
in the next two sections. 
The instance mapping. Let $C = \{c_1, \ldots, c_m\}$ be a set of clauses over the set of
variables $U = \{u_1, \ldots, u_n\}$ such that each clause contains exactly two literals,
and let $H$ be a positive integer. The alphabet $\Sigma(C)$ of the corresponding instance
of Minimum Local Maxima Number contains the following symbols:

- Two special symbols $\sigma_m$ and $\sigma_M$ (which, intuitively, will be the extremal
symbols in any “reasonable” order of the alphabet).

- For each variable $u_i$ with $1 \le i \le n$, two symbols $\sigma_i^{u}$ and $\sigma_i^{\bar{u}}$.

- For each clause $c_j$ with $1 \le j \le m$, a symbol $\sigma_j^{c}$.

The string $X(C)$ of the instance of Minimum Local Maxima Number is
formed by several substrings with different goals. In order to define it, we first
introduce the following gadget: given three symbols a, b, and c, let $g(a, b, c) =
abcbcba$. The next result states the basic property of the previously defined
gadget.

Lemma 1. Let $\Sigma$ be an alphabet and let a, b, and c be three symbols in $\Sigma$. For
any order $\pi$ of $\Sigma$ and for any integer r > 0, the following hold:

1. If $\pi(c) < \pi(b) < \pi(a)$, then $\mathcal{M}(g(a, b, c)^r, \pi) = 2r - 1$.

2. If $\pi(a) < \pi(b) < \pi(c)$, then $\mathcal{M}(g(a, b, c)^r, \pi) = 2r$.

3. If none of the previous two cases applies, then $\mathcal{M}(g(a, b, c)^r, \pi) \ge 3r - 1$.

Proof. The proof of the lemma is done by examining all possible cases. Indeed,
Table 1 shows the occurrences of maxima produced by one gadget for each
of the six possible orders of the three symbols a, b, and c.

Table 1. The possible orders of the gadget symbols

By looking at the table, it is easy to verify the correctness of the three
statements of the lemma. □
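
The case analysis behind Lemma 1 can also be checked mechanically for small r. The following Python sketch enumerates the six orders of a, b and c and compares the number of maxima in $g(a, b, c)^r$ with the three statements. It is an illustration under one assumption: the left-hand comparison in the measure is taken to be non-strict, so that the second of two adjacent equal symbols can count as a maximum. The function names are hypothetical.

```python
from itertools import permutations


def local_maxima_measure(text, rank):
    """Count interior positions i with rank[x[i-1]] <= rank[x[i]] > rank[x[i+1]]."""
    return sum(
        1
        for i in range(1, len(text) - 1)
        if rank[text[i - 1]] <= rank[text[i]] > rank[text[i + 1]]
    )


def gadget(a, b, c):
    """The gadget g(a, b, c) = abcbcba."""
    return a + b + c + b + c + b + a


def lemma1_holds(r):
    """Check the three statements of Lemma 1 for r copies of the gadget."""
    for order in permutations("abc"):
        rank = {symbol: i for i, symbol in enumerate(order)}
        count = local_maxima_measure(gadget("a", "b", "c") * r, rank)
        if rank["c"] < rank["b"] < rank["a"]:
            ok = count == 2 * r - 1
        elif rank["a"] < rank["b"] < rank["c"]:
            ok = count == 2 * r
        else:
            ok = count >= 3 * r - 1
        if not ok:
            return False
    return True


print(all(lemma1_holds(r) for r in range(1, 6)))
```

Such a check covers only the values of r that are tried; it complements the case analysis of the proof and does not replace it.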

The first $m + 2n$ substrings of the instance $X$ will force $\sigma_m$ and $\sigma_M$ to be the
extremal symbols of any efficient order of $\Sigma$. They are then defined as follows:

- For j = 1, . . . ,TO, X( = 

- For i = l,...,n, X^ = g((Tm, ct“, 

- For f = 1, . . . ,n, = 5 (crm,(J",crM)""^ 
The next $nm$ substrings will force each pair of variable symbols to be either
both on the left or both on the right of all clause symbols. In particular, for
$i = 1, \ldots, n$ and for $j = 1, \ldots, m$, we define

Yl = (cr„,5(cr“,cr“,cr|)crM)™". 

Finally, for each clause $c_j$ with $1 \le j \le m$, we have one substring whose
definition depends on the type of the clause and whose goal is to decide the
truth-value of each variable depending on the position of its symbols relative to the
clause symbols. In particular:

- If Cj = UiV Uk, then Zj = 

- If Cj = -~Ui V Uk, then = CrmCr“(T^(T“(T^(T^(Tm. 

- If Cj = V -^Uk, then Zj = 

- If Cj = -^Ui V ^Uk, then Zj = crmCrl^crjCr^crjCrMaa- 

In conclusion, the instance $X$ is defined as:

$X = X_1^1 \cdots X_1^m \, X_2^1 \cdots X_2^n \, X_3^1 \cdots X_3^n \, Y_1^1 \cdots Y_n^m \, Z_1 \cdots Z_m.$

The proof of correctness. We now prove the following result.

Lemma 2. For any set $C$ of m clauses with exactly two literals per clause and
for any integer $H$, there exists a truth-assignment that satisfies $H$ clauses in $C$
if and only if there exists an order $\pi$ of $\Sigma(C)$ such that

M{X{C), 7t) = (2to -I- 7n)m^ 3m — FI. 

Proof. Assume that there exists a truth-assignment $\tau$ to the variables $u_1, \ldots, u_n$
that satisfies $H$ clauses in $C$. We then define the corresponding order $\pi$ of the
symbols in $\Sigma(C)$ as follows.

1. For any symbol a different from $\sigma_m$ and $\sigma_M$, $\pi(\sigma_m) < \pi(a) < \pi(\sigma_M)$.

2. For any i with $1 \le i \le n$ and for any j with $1 \le j \le m$, if $\tau(u_i) = false$,
then $\pi(\sigma_i^{u}) < \pi(\sigma_i^{\bar{u}}) < \pi(\sigma_j^{c})$, otherwise $\pi(\sigma_j^{c}) < \pi(\sigma_i^{\bar{u}}) < \pi(\sigma_i^{u})$.

From Lemma 1 it follows that 

M{Xl ■ ■ ■ X^Xi ■ ■ ■ X^X^ • • • A”, 7t) = (2m -f 4n)m^ 

Moreover, because of the same lemma and since ends with the maximal 
symbol, we have that 

M(X^ ■ ■ ■ Xf^X^ ■ ■ ■ X^X^ ■ ■ ■ X^Y^ ■ ■ ■ tt) = (2m -f An)m^ + Snm^ - 1 

= (2m -I- 7n)m^ — 1. 

The concatenation of Yff' with produces one more maximum. Successively, 
the number of maxima will depend on whether one clause is satisfied: indeed, 
it is possible to prove that if the $j$th clause is satisfied, then $Z_j$ produces two
maxima, otherwise it produces three maxima.

Table 2. The occurrences of maxima corresponding to $u_i \lor u_k$

For example, assume that $C_j$ is the
disjunction of two positive literals $u_i \lor u_k$. The occurrences of maxima according
to the truth-values of the two variables are then shown in Table 2 (recall that by
definition of tt, for any h = 1, . . . ,n and for any ? = 1, . . . , m, 7r(crj() < 7r((jf) if 
and only if r(cr^) = false, and that, in this case, Zj = crmCr|cr“(T|(T^(T|crm). As it 
can be seen from the table, we have two maxima for each of the three
truth-assignments that satisfy the clause and three maxima for the
non-satisfying assignment. The other types of clauses can be dealt with in a
similar way.

In summary, we have that the number of maxima generated by tt on the string 
X{C) is equal to (2m -I- 7n)m^ + 2H + 3(m — H) = (2m -I- 7n)m^ + 3m — H. 

Conversely, assume that an order tt of the symbols in S{C) is given such that 
the number of maxima generated on X(C) is equal to (2m -I- 7n)m^ + 3m — H . 
Because of Lemma 1, the substring X\ ■ ■ ■ A("A 2 • • • X^X^ ■ ■ ■ X^ produces at 
least (2m-|-4n)m^— 1 maxima and it ensures that either is the minimal symbol 
and (Tm is the maximal one or is the minimal symbol and is the maximal one. 
Assume that the former case holds so that (2m-|-4n)m^ are generated (the latter 
case can be dealt in a similar way) . The next substring T/ • • • ■ ■ - YX ■ ■ Y^a^ 

instead produces at least 3nm^ maxima and it ensures that, for any i with 
1 < i < n and for any j with 1 < j < m, either 7r(cr“) < 7r(cr“) < 7r(cr|) or 
7r(crp < 7r(cr“) < 7r(cr“). We then assign the value true to variable Ui if the 
former case holds, otherwise we assign to it the value false. It is then easy 
to verify that the remaining 3m — H maxima are produced by H clauses that 
are satisfied and m — H clauses that are not satisfied. For example, assume 
that Cj = ^Ui V ^Uj so that Zj = crmCr“cr|cr^cr|fTMcrm. The truth-values of Ui and 
Uj corresponding to the six possible order of (t“, <t^, and cr| and the resulting 
truth-value of Cj along with the number of generated maxima are shown in 
Table 3. We have thus shown that H clauses of C can be satisfied if and only if 
M{X{C),tt) = (2m -I- 7n)m^ + 3m — H and the lemma is proved. □ 

From the above lemma and from the fact that Minimum Local Maxima
Number belongs to NP, we obtain the following theorem.

Theorem 1. Minimum Local Maxima Number is NP-complete.

Table 3. The possible orders corresponding to $C_j = \neg u_i \lor \neg u_k$



3 The Sparsification Algorithm 

We have seen in the previous section that the problem of assigning an ordering
to the characters of a sequence which minimizes the number of local maxima is a
hard problem. Clearly, for any fixed string, the number of local maxima produced
by any ordering is at most half of the length of the string. The following lemma,
instead, guarantees that, for any fixed ordering $\pi$, the expected number of local
maxima produced by $\pi$ on a randomly chosen string is at most one third of the
length of the string.

Lemma 3. Let $\pi$ be an order over an alphabet $\Sigma$. If $X$ is a randomly chosen
string over $\Sigma$ of length $n$, then the expected value of $M(X, \pi)$ is at most $n/3$.

Proof. Let $X = x_1 \cdots x_n$ be the randomly chosen string over $\Sigma$ and let $T(x_k)$
be the random variable that equals 1 if $x_k$ is a maximum and 0 otherwise, for
any $k$ with $1 \le k \le n$. Clearly, for any $k$ with $2 \le k \le n - 1$,

$$\Pr[T(x_k) = 1] = \Pr[\pi(x_{k-1}) < \pi(x_k)] \, \Pr[\pi(x_{k+1}) < \pi(x_k)].$$

Hence, the probability that $x_k$ is a maximum, assuming that $\pi(x_k) = i$, is

$$\Pr[T(x_k) = 1 \mid \pi(x_k) = i] = \sum_{j=1}^{i-1} \Pr[\pi(x_{k-1}) = j] \sum_{j=1}^{i-1} \Pr[\pi(x_{k+1}) = j] = \left(\frac{i-1}{|\Sigma|}\right)^2.$$

Finally, the probability that $x_k$ is a maximum is

$$\Pr[T(x_k) = 1] = \sum_{i=1}^{|\Sigma|} \Pr[T(x_k) = 1 \mid \pi(x_k) = i] \, \Pr[\pi(x_k) = i] = \frac{1}{|\Sigma|^3} \sum_{i=1}^{|\Sigma|} (i-1)^2 = \frac{1}{3} - \frac{1}{2|\Sigma|} + \frac{1}{6|\Sigma|^2} \le \frac{1}{3}.$$

By linearity of expectation, the expected number of local maxima is

$$E[M(X, \pi)] = \sum_{k=2}^{n-1} \Pr[T(x_k) = 1] \le \frac{n}{3},$$

and the lemma follows. □
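
The bound of Lemma 3 can be checked by brute force on small cases. The following is a minimal sketch, assuming that every symbol of the string is drawn uniformly and independently and that only interior positions can be local maxima (as in the example later in this section); the function names are illustrative and not part of the original paper.

```python
from fractions import Fraction
from itertools import product


def local_maxima(ranks):
    """Number of interior positions whose rank strictly exceeds both neighbours."""
    return sum(1 for k in range(1, len(ranks) - 1)
               if ranks[k - 1] < ranks[k] > ranks[k + 1])


def expected_maxima(sigma, n):
    """Exact expectation of the number of local maxima over all sigma**n strings."""
    total = sum(local_maxima(x) for x in product(range(sigma), repeat=n))
    return Fraction(total, sigma ** n)


def closed_form(sigma):
    """Per-position probability 1/3 - 1/(2 sigma) + 1/(6 sigma^2)."""
    return Fraction(1, 3) - Fraction(1, 2 * sigma) + Fraction(1, 6 * sigma ** 2)


if __name__ == "__main__":
    # For a 4-letter alphabet and strings of length 5 there are 3 interior positions.
    print(expected_maxima(4, 5), 3 * closed_form(4))
```

Because symbols are represented directly by their ranks, the choice of the order does not matter here: any fixed order yields the same expectation on uniformly random strings.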

The above lemma suggests that random strings (that is, strings which are not
compressible) can be sparsified by means of the local maxima technique so that
the expected number of resulting access points is at most one third of the length
of the original string. We wish to exploit this property in order to design a
sparsification procedure that replaces a given string with a shorter one made up
of only the local maxima (clearly, the new string will not contain all the
original information). We repeat this simple procedure by computing the local
maxima of the new string to obtain an even shorter string, and we iterate this
shortening until the required sparsification is obtained. That is, the compressed
string is short enough to be efficiently processed, but it still contains enough
information to solve a given problem, as we will see shortly. For example, let us
consider the very basic problem of searching for a pattern string in a text
string. We can compress the two strings by means of our sparsification procedure.
Then, we search for the pattern by matching the local maxima only. Whenever a
match is detected, we check that it is not a false occurrence by a full
comparison of the pattern and the text substring at hand. It is worth pointing
out that the number of times we apply the sparsification to the text must be
related to the length of the patterns we are going to search for. Indeed,
performing too many iterations of the sparsification could drop too many
characters between two consecutive local maxima selected in the last iteration.
As a result, we could fail to find the pattern because it is too short (see
Lemma 4).

Care must also be taken with alphabets of small size, such as binary
strings or DNA sequences. At each iteration of the algorithm, at least one
character of the alphabet disappears from the new string, since the smallest
character in the alphabet is never selected as a local maximum. This fact can be
a limitation, for instance in DNA sequences, where $|\Sigma|$ is only 4. Indeed, it
would seem that we can apply the sparsification fewer than $|\Sigma|$ times. We can
circumvent this problem by storing each local maximum along with its offset to
(i.e., the number of characters before) the next maximum. Each local maximum in
$\Sigma$ is replaced by a new character given by the pair (local maximum, offset) in
the new alphabet $\Sigma \times \mathbb{N}$ under the lexicographic order.

In order to explain the sparsification algorithm, let us consider the following
string over the alphabet $\Sigma$ of four characters A, C, G, and T:

$T_0$ = TGACACGTGACGAGCACACACGTCGCAGATGCATA.

Assuming that the characters are ordered according to the lexicographical
order, the number of local maxima contained in the above string is 11 (which
is approximately one third of the total length of the string, i.e., 35). The new
string obtained after the first iteration of the sparsification algorithm is then the
following one:

$T_1$ = (C,4) (T,4) (G,2) (G,3) (C,2) (C,4) (T,2) (G,3) (G,2) (T,4) (T,2).

Observe that the new alphabet consists of six characters, each one composed
of a character of $\Sigma - \{A\}$ and a natural number in $\{2, \ldots, 4\}$. Assuming that
these characters are ordered according to the lexicographical order, the
number of local maxima contained in the above string is 4 (which is
approximately one third of the total length of the string, i.e., 11). The new
string obtained after the second iteration of the sparsification algorithm is then
the following one:

$T_2$ = (T,6) (G,9) (T,7) (T,6).
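
The two iterations above can be reproduced with a few lines of Python. This is only a sketch under assumptions read off the example: a local maximum is an interior symbol strictly larger than both neighbours, pairs are compared lexicographically, and the offset of the last maximum is its distance to the end of the string. The names `sparsify` and `lift` are illustrative and do not come from the paper.

```python
def lift(text):
    """Turn a plain string into (character, offset) pairs with unit offsets."""
    return [(c, 1) for c in text]


def sparsify(symbols):
    """One iteration: keep the interior strict local maxima, each paired with
    its distance (in original characters) to the next kept maximum, or to
    the end of the string for the last one."""
    # Starting position of every symbol, measured in original characters.
    starts, pos = [], 0
    for _, offset in symbols:
        starts.append(pos)
        pos += offset
    total = pos
    # Python compares tuples lexicographically, as the text requires.
    maxima = [k for k in range(1, len(symbols) - 1)
              if symbols[k - 1] < symbols[k] > symbols[k + 1]]
    result = []
    for a, k in enumerate(maxima):
        nxt = starts[maxima[a + 1]] if a + 1 < len(maxima) else total
        result.append((symbols[k][0], nxt - starts[k]))
    return result


if __name__ == "__main__":
    t0 = "TGACACGTGACGAGCACACACGTCGCAGATGCATA"
    t1 = sparsify(lift(t0))
    t2 = sparsify(t1)
    print(t1)
    print(t2)
```

Keeping the offsets as running sums means that each later iteration still measures distances in characters of the original string, which is what the pairs of $T_2$ report.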

Assume we are looking for the pattern ACACGTGACGAGCA, which occurs in $T_0$
starting at position 3. By applying the first iteration of the sparsification
algorithm to the pattern, we obtain the string (C,4) (T,4) (G,2) (G,3), which occurs
in $T_1$ starting at position 1. However, if we apply the second iteration of the
sparsification algorithm to the pattern, we obtain the new pattern (T,9), which
does not occur in $T_2$. Indeed, as we have already observed, the size of the pattern
to be searched bounds the number of iterations of the algorithm that can be
performed. Formally, let $T_i$, for $i \ge 1$, be the text after the $i$th iteration and
let $m_i$ be the maximum value of the offset of a local maximum in $T_i$. The following
can be easily verified:

Lemma 4. A pattern $P$ of size $m$ is successfully found in a text $T_i$, as long as
$m > 2m_i$.

In the previous example, we have that $m = 14$, $m_1 = 4$, and $m_2 = 9$. According to
the above lemma, the pattern is guaranteed to be found in $T_1$, whereas no such
guarantee holds for $T_2$, where indeed it is not found.
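
The example can be replayed programmatically. The sketch below sparsifies both the text and the pattern and looks for the sparsified pattern as a contiguous block of the sparsified text; it assumes the same conventions as the worked example (interior strict maxima, lexicographic order on pairs, last offset measured to the end of the string). It is not the search procedure of the paper, and the helper names are hypothetical.

```python
def sparsify(symbols):
    """Keep interior strict local maxima with their distance to the next one."""
    starts, pos = [], 0
    for _, offset in symbols:
        starts.append(pos)
        pos += offset
    maxima = [k for k in range(1, len(symbols) - 1)
              if symbols[k - 1] < symbols[k] > symbols[k + 1]]
    ends = [starts[k] for k in maxima[1:]] + [pos]
    return [(symbols[k][0], end - starts[k]) for k, end in zip(maxima, ends)]


def levels(text, iterations):
    """levels(text, r)[i] is the string after i sparsification iterations."""
    out = [[(c, 1) for c in text]]
    for _ in range(iterations):
        out.append(sparsify(out[-1]))
    return out


def occurrences(needle, haystack):
    """1-based positions where needle occurs as a contiguous block."""
    k = len(needle)
    return [i + 1 for i in range(len(haystack) - k + 1)
            if haystack[i:i + k] == needle]


if __name__ == "__main__":
    pattern = "ACACGTGACGAGCA"
    T = levels("TGACACGTGACGAGCACACACGTCGCAGATGCATA", 2)
    P = levels(pattern, 2)
    for i in (1, 2):
        m_i = max(offset for _, offset in T[i])
        print(i, occurrences(P[i], T[i]), len(pattern) > 2 * m_i)
```

For $i = 1$ the condition of Lemma 4 holds and the sparsified pattern is found at position 1; for $i = 2$ the condition fails and the search returns no occurrence.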

4 Experimental Results 

In our experiments, we consider DNA sequences, where $\Sigma$ = {A, T, C, G}. In
Table 4, we report the number of local maxima obtained for three DNA
sequences: Saccharomyces cerevisiae (file IV.fna), Archaeoglobus fulgidus (file
aful.fna) and Escherichia coli (file ecoli.fna), for three consecutive iterations
of the algorithm.

In the $i$th iteration, $i = 1, \ldots, 3$, we have $n_i$ local maxima with maximum
distance $m_i$ between two consecutive ones. The value of $m_i$ is not exactly
the maximum among all possible values: a very small number of distances are
much larger than the majority. The additive term in the figures for $n_i$
accounts for those local maxima that are at distance greater than $m_i$. For
example, after the first iteration on Saccharomyces cerevisiae (file IV.fna), there
are $n_1 = 459027$ local maxima having offset at most $m_1 = 18$, and only 616
local maxima with offset larger than 18 (actually, much larger). It goes without
saying that it is better to treat these 616 maxima independently from the rest
of the maxima. Finally, we observe that the value of $n_i$ shrinks to about one
third at each iteration.

file        length     n_1            m_1   n_2            m_2   n_3            m_3
IV.fna                 459027+616     18    146408+1814    37    47466+6618     92
aful.fna    2178460    658396+101     13    213812+952     35    69266+3781     94
ecoli.fna   4639283    1418905+61     14    458498+851     37    148134+4572    98

Table 4. Sample values for three DNA sequences

In Fig. 1, we report the distribution of the distances between consecutive
maxima in the sequence after each of three iterations of the sparsification
algorithm. After the first iteration almost all the values are concentrated in a
small range (see Fig. 1a); the shape of the distribution curve is maintained and
flattened after the next two iterations (see Fig. 1b-c).

Fig. 1. Distribution of distances among the local maxima for file ecoli.fna.
Data for iteration $i = 1$ is reported in (a) where distances range from 1 to 16,
for $i = 2$ in (b) where distances range from 3 to 46, and for $i = 3$ in (c) where
distances range from 8 to 144.



As a result of the application of our sparsification technique to construct a
text index on the suffixes starting at the local maxima (for this purpose, we use
a suffix array in our experiments), the occupied space is small compared to the
text size itself, and this seems to be a rather interesting feature. Here, we are
considering the exact string matching problem. The application of our method
to other important string problems, such as multiple sequence alignment and
matching with errors, seems promising but is still under study. As the
search for patterns in DNA applications has to take into account the
possibility of errors, one should use approximate, rather than exact,
string matching. However, in several algorithms used in practice, finding the
exact occurrences of the pattern in the text [2] is a basic filtering step towards
solving the approximate problem due to the large size of the text involved.

Three final considerations are in order. First, the threshold of $2m_i$ on the
minimum pattern length in Lemma 4 is overly pessimistic. In our experiments,
we successfully found all patterns of length at least $m_i/2$. For example, in file
ecoli.fna, we had $m_3 = 98$. We searched for patterns of length ranging from
50 to 198, and all searches were successful.

Second, it may seem that a number of false matches are caused by discarding
the first characters in the pattern in each iteration of the sparsification
algorithm. Instead, the great majority of these searches did not give rise to
false matches due to the local maxima. Specifically, out of about
150,000 searches, we counted only about 300 searches giving false matches, and
the average ratio between good matches and total matches (including false ones)
was 0.28.

Finally, the most important feature of the index is that it saves a lot of space.
For example, a plain suffix array for indexing file ecoli.fna requires about 17.7
megabytes. Applying one iteration of the sparsification algorithm reduces the
space to 5.4 megabytes, provided that the pattern length $m$ is at least 14; the
next two iterations give 1.8 megabytes (for $m > 37$) and 0.6 megabytes (for
$m > 98$), respectively. These figures compare favorably with the text size of 1.1
megabytes obtained by encoding each symbol with two bits. The tradeoff between
pattern length and index space is inevitable as the DNA strings are incompressible.
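
The space figures quoted above are consistent with the counts in Table 4 under two assumptions that the text does not state explicitly: each suffix-array entry takes 4 bytes, and a megabyte means $2^{20}$ bytes. The following sketch redoes the arithmetic under exactly these assumptions.

```python
MIB = 2 ** 20       # assumed meaning of "megabyte"
ENTRY_BYTES = 4     # assumed size of one suffix-array entry


def suffix_array_mib(entries):
    """Size in MiB of a suffix array with the given number of entries."""
    return entries * ENTRY_BYTES / MIB


text_length = 4639283                                   # ecoli.fna, Table 4
maxima = [1418905 + 61, 458498 + 851, 148134 + 4572]    # n_1, n_2, n_3 with additive terms

full_index = suffix_array_mib(text_length)
sparse_indexes = [suffix_array_mib(n) for n in maxima]
packed_text = text_length * 2 / 8 / MIB                 # two bits per symbol

if __name__ == "__main__":
    print(round(full_index, 1), [round(x, 1) for x in sparse_indexes],
          round(packed_text, 1))
```

Rounded to one decimal, these values reproduce the 17.7, 5.4, 1.8, 0.6 and 1.1 megabytes reported in the text.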

5 Conclusion and Open Questions 

In this paper, we have investigated some properties of a text sparsification
technique based on the identification of local maxima. In particular, we have
shown that looking for the best order of the alphabet symbols is an NP-hard
problem. Subsequently, we have described how the local maxima sparsification
technique can be used to filter the access to unstructured texts. Finally, we have
experimentally shown that this approach can be successfully used to create a
space-efficient index for searching a DNA sequence as quickly as with a full index.

Regarding the combinatorial optimization problem, the main question left 
open by this paper is whether the optimization version of Minimum Local 
Maxima Number admits a polynomial-time approximation algorithm. It would 
also be interesting to accompany the experimental results obtained with DNA 
sequences by some theoretical results, such as the evaluation of the expected 
maximal distance between two local maxima or the expected number of false 
matches. 

Abstract. Let a text string $T$ of $n$ symbols and a pattern string $P$ of
$m$ symbols from alphabet $\Sigma$ be given. A swapped version $P'$ of $P$ is a
length $m$ string derived from $P$ by a series of local swaps (i.e.,
$p'_\ell \leftarrow p_{\ell+1}$ and $p'_{\ell+1} \leftarrow p_\ell$), where each element can participate in
no more than one swap. The Pattern Matching with Swaps problem is that of
finding all locations $i$ of $T$ for which there exists a swapped version $P'$ of $P$
with an exact matching of $P'$ in location $i$ of $T$.

Recently, some efficient algorithms were developed for this problem. 
Their time complexity is better than the best known algorithms for 
pattern matching with mismatches. However, the Approximate Pattern 
Matching with Swaps problem was not known to be solved faster than 
the pattern matching with mismatches problem. 

In the Approximate Pattern Matching with Swaps problem the output is,
for every text location $i$ where there is a swapped match of $P$, the number
of swaps necessary to create the swapped version that matches location
$i$. The fastest known method to date is that of counting mismatches and
dividing by two. The time complexity of this method is $O(n\sqrt{m \log m})$
for a general alphabet $\Sigma$.

In this paper we show an algorithm that counts the number of swaps
at every location where there is a swapped matching in time
$O(n \log m \log \sigma)$, where $\sigma = \min(m, |\Sigma|)$. Consequently, the total time
for solving the approximate pattern matching with swaps problem is
$O(f(n, m) + n \log m \log \sigma)$, where $f(n, m)$ is the time necessary for
solving the pattern matching with swaps problem.

Key Words: Design and analysis of algorithms, combinatorial algo- 
rithms on words, pattern matching, pattern matching with swaps, non- 
standard pattern matching, approximate pattern matching. 

1 Introduction

The Pattern Matching with Swaps problem (the Swap Matching problem, for 
short) requires finding all occurrences of a pattern of length m in a text of length 
n. The pattern is said to match the text at a given location i if adjacent pattern 
characters can be swapped, if necessary, so as to make the pattern identical to 
the substring of the text starting at location i. All the swaps are constrained to 
be disjoint, i.e., each character is involved in at most one swap. 

The importance of the swap matching problem lies in recent efforts to understand 
the complexity of various generalized pattern matching problems. The textbook 
problem of exact string matching that was first shown to be solvable in linear 
time by Knuth, Morris and Pratt [10] does not answer the growing requirements 
stemming from advances in Multimedia, Digital libraries and Computational Bi- 
ology. To this end, pattern matching has to adapt itself to increasingly broader 
definitions of “matching” [18, 17]. In computational biology one may be inter- 
ested in finding a “close” mutation, in communications one may want to adjust 
for transmission noise, in texts it may be desirable to allow common typing er- 
rors. In multimedia one may want to adjust for lossy compressions, occlusions, 
scaling, affine transformations or dimension loss. 

The above applications motivated research of two new types - Generalized Pat- 
tern Matching, and Approximate Pattern Matching. In generalized matching the 
input is still a text and pattern but the “matching” relation is defined differently. 
The output is all locations in the text where the pattern “matches” under the 
new definition of match. The different applications define the matching relation. 
An early generalized matching was the string matching with don’t cares problem 
defined by Fischer and Paterson [8]. Another example of a generalized matching 
problem is the less-than matching [4] problem defined by Amir and Farach. In 
this problem both text and pattern are numbers. One seeks all text locations 
where every pattern number is less than its corresponding text number. Amir
and Farach showed that the less-than-matching problem can be solved in time
$O(n\sqrt{m \log m})$.

Muthukrishnan and Ramesh [15] prove that practically all general matching 
relations, where the generalization is in the definition of single symbol matches, 
are equivalent to the boolean convolutions, i.e. it is unlikely that they could be 
solved in time faster than O(nlogm), where n is the text length and m is the 
pattern length. As we have seen, some examples have significantly worse upper 
bound than this. 

The swap matching problem is also a generalized matching problem. It arises 
from one of the edit operations considered by Lowrance and Wagner [14, 19] to 
define a distance metric between strings. 

Amir et al. [3] obtained the first non-trivial results for this problem. They showed
how to solve the problem in time $O(n m^{1/3} \log m \log \sigma)$, where $\sigma = \min(|\Sigma|, m)$.

Amir et al. [5] also give certain special cases for which $O(m\,\mathrm{polylog}(m))$ time can
be obtained. However, these cases are rather restrictive. Cole and Hariharan [6]
give a randomized algorithm that solves the swap matching problem over a
binary alphabet in time $O(n \log n)$.

The second important pattern matching paradigm is that of approximate match- 
ing. Even under the appropriate matching relation there is still a distinction be- 
tween exact matching and approximate matching. In the latter case, a distance 
function is defined on the text. A text location is considered a match if the dis- 
tance between it and the pattern, under the given distance function, is within 
the tolerated bounds. 

The fundamental question is what type of approximations are inherently hard 
computationally, and what types are faster to compute. This question motivated 
much of the pattern matching research in the last couple of decades. 

The earliest and best known distance function is Levenshtein’s edit distance [13]. 
The edit distance between two strings is the smallest number of edit operations, 
in this case insertions, deletions, and mismatches, whereby one string can be 
converted to the other. Let $n$ be the text length and $m$ the pattern length. A
straightforward $O(nm)$ dynamic programming algorithm computes the edit
distance between the text and pattern. Lowrance and Wagner [14, 19] proposed an
$O(nm)$ dynamic programming algorithm for the extended edit distance
problem, where the swap edit operation is added. In [9, 11, 12] $O(kn)$ algorithms are
given for the edit distance with only $k$ allowed edit operations. Recently, Cole
and Hariharan [7] presented an $O(nk^4/m + n + m)$ algorithm for this problem.

Since the upper bound for the edit distance seems very tough to break, at- 
tempts were made to consider the edit operations separately. If only mismatches 
are counted for the distance metric, we get the Hamming distance, which defines 
the string matching with mismatches problem. A great amount of work was done 
on finding efficient algorithms for string matching with mismatches. By methods
similar to those of Fischer and Paterson [8] it can be shown that the string
matching with mismatches problem can be solved in time $O(\min(|\Sigma|, m)\, n \log m)$. For
given finite alphabets, this is $O(n \log m)$. Abrahamson [1] developed an algorithm
that solves this problem for general alphabets in time $O(n\sqrt{m \log m})$.
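
The structure behind the $O(\min(|\Sigma|, m)\, n \log m)$ bound can be illustrated as follows: for each symbol, correlate the indicator vector of the pattern with that of the text, and add up the per-symbol match counts. The sketch below computes each correlation naively, so it does not achieve the stated running time; an FFT-based convolution would replace the inner loop. The function name is illustrative.

```python
def count_mismatches(pattern, text):
    """Hamming distance between pattern and every length-m window of text,
    accumulated one alphabet symbol at a time."""
    m, n = len(pattern), len(text)
    matches = [0] * (n - m + 1)
    for c in set(pattern):
        p_ind = [1 if x == c else 0 for x in pattern]
        t_ind = [1 if x == c else 0 for x in text]
        # Naive correlation of the two indicator vectors; this is the step
        # that a fast convolution would compute for all alignments at once.
        for i in range(n - m + 1):
            matches[i] += sum(p_ind[j] * t_ind[i + j] for j in range(m))
    return [m - k for k in matches]


if __name__ == "__main__":
    print(count_mismatches("abab", "abbaabab"))
```

Only symbols that occur in the pattern need a correlation, which is where the factor $\min(|\Sigma|, m)$ comes from.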

The approximate pattern matching with swaps problem considers the swaps as 
the only edit operation and seeks to compute, for each text location i, the number 
of swaps necessary to convert the pattern to the substring of length m starting 
at text location i (provided there is a swap match at i). In [2] it was shown 
that the approximate pattern matching with swaps problem can be reduced to 
the string matching with mismatches problem. For every location where there 
is a swap match, the number of swaps is precisely equal to half the number 
of mismatches (since a swap is two mismatches). Although swap matching as 
a generalized matching proved to be more efficient than counting mismatches, 
it remained open whether swap matching as an approximation problem can be 
done faster than mismatches. 

In this paper we answer this question in the affirmative. We show that if all
locations where there is a swap match are known, the approximate swap matching
problem can be solved in time $O(n \log m \log \sigma)$, where $\sigma = \min(|\Sigma|, m)$.
Therefore, assuming swap matching can be done in time $f(n, m)$, approximate swap
matching can be done in time $O(f(n, m) + n \log m \log \sigma)$.

Paper organization. This paper is organized in the following way. In section 2
we give basic definitions. In section 3, we outline the key idea and intuition
behind our algorithm. In section 4 we give a randomized algorithm, which easily
highlights the idea of our solution. It turns out that rather than using a generic 
derandomization strategy, a simple, problem specific, method can be used to 
obtain the deterministic counterpart. Section 5 presents an easy and efficient 
code that solves our problem deterministically. 



2 Problem Definition 

Definition: Let $S = s_1 \ldots s_n$ be a string over alphabet $\Sigma$. A swap permutation
for $S$ is a permutation $\pi : \{1, \ldots, n\} \to \{1, \ldots, n\}$ such that

1. if $\pi(i) = j$ then $\pi(j) = i$ (characters are swapped).

2. for all $i$, $\pi(i) \in \{i - 1, i, i + 1\}$ (only adjacent characters are swapped).

3. if $\pi(i) \ne i$ then $s_{\pi(i)} \ne s_i$ (identical characters are not swapped).

For a given string $S = s_1 \ldots s_n$ and swap permutation $\pi$ for $S$ we denote
$\pi(S) = s_{\pi(1)} s_{\pi(2)} \cdots s_{\pi(n)}$. We call $\pi(S)$ a swapped version of $S$.

The number of swaps in swapped version $\pi(S)$ of $S$ is the number of pairs $(i, i+1)$
where $\pi(i) = i + 1$ and $\pi(i + 1) = i$.

For pattern $P = p_1 \ldots p_m$ and text $T = t_1 \ldots t_n$, we say that $P$ swap matches
at location $i$ if there exists a swapped version $P'$ of $P$ that matches $T$ starting
at location $i$, i.e. $p'_j = t_{i+j-1}$ for $j = 1, \ldots, m$. It is not difficult to see that if $P$
swap matches at location $i$ there is a unique swap permutation for that location.

The Swap Matching Problem is the following:

INPUT: Pattern $P = p_1 \ldots p_m$ and text $T = t_1 \ldots t_n$ over alphabet $\Sigma$.
OUTPUT: All locations $i$ where $P$ swap matches $T$.

We note that the definition in [3] and the papers that followed is slightly different,
allowing the swaps in the text rather than the pattern. However, it follows from
Lemma 1 in [3] that both versions are of the same time complexity.

The Approximate Swap Matching Problem is the following:

INPUT: Pattern $P = p_1 \ldots p_m$ and text $T = t_1 \ldots t_n$ over alphabet $\Sigma$.
OUTPUT: For every location $i$ where $P$ swap matches $T$, write the number of
swaps in the swapped version of $P$ that matches the text substring of length $m$
starting at location $i$. If there is no swap matching of $P$ at $i$, write $m + 1$ at
location $i$.

Observation 1 Assume there is a swap match at location $i$. Then the number
of swaps is equal to half the number of mismatches at location $i$.
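
To make the two problem statements concrete, here is a naive $O(nm)$ reference sketch; it is not the algorithm of this paper. It relies on the uniqueness of the swap permutation: scanning left to right, a matching position must stay in place and a mismatching position must be swapped with its right neighbour. Locations are 0-based in the code, whereas the text counts from 1; the function names are illustrative.

```python
def swap_count(pattern, text, i):
    """Number of swaps turning pattern into text[i:i+m], or None if the
    pattern does not swap match at (0-based) location i."""
    m = len(pattern)
    swaps, j = 0, 0
    while j < m:
        if pattern[j] == text[i + j]:
            j += 1                      # position j stays in place
        elif (j + 1 < m and pattern[j] == text[i + j + 1]
              and pattern[j + 1] == text[i + j]):
            swaps += 1                  # positions j and j + 1 are swapped
            j += 2
        else:
            return None
    return swaps


def approximate_swap_matching(pattern, text):
    """Output of the Approximate Swap Matching Problem: the number of swaps
    at every swap match, and m + 1 at every other location."""
    m, n = len(pattern), len(text)
    out = []
    for i in range(n - m + 1):
        s = swap_count(pattern, text, i)
        out.append(m + 1 if s is None else s)
    return out


if __name__ == "__main__":
    print(approximate_swap_matching("abcab", "bacabxabcba"))
```

In this example the pattern swap matches at the first and last locations with one swap each, and in both cases the window differs from the pattern in exactly two positions, as Observation 1 states.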



3 Intuition and Key Idea 

It would seem from Observation 1 that finding the number of swaps is of the
same difficulty as finding the number of mismatches. However, this is not the
case. Note that if it is known that there is a swap match at location $i$, this
puts tremendous constraints on the mismatches. It means that if there is a
mismatch between pattern location $j$ and text location $i + j - 1$ we also know
that either $t_{i+j-1} = p_{j-1}$ or $t_{i+j-1} = p_{j+1}$. There is no such constraint in a
general mismatch situation!

Since we can “anticipate” for every pattern symbol what would be the mismatch, 
it gives us some flexibility to change the alphabet to reflect the expected mis- 
matches. Thus we are able to reduce the alphabet to one with a small constant 
size. For such alphabets, the Fischer and Paterson algorithm [8] allows counting 
mismatches in time O(nlogm). 

In order to be able to anticipate the mismatching symbol, we need to isolate every
pattern symbol from its right and left neighbors. This can be done by splitting
a pattern $P$ into three patterns, $P_1$, $P_2$ and $P_3$, where each subpattern counts
mismatches only in the central element of each triple. $P_1$, $P_2$ and $P_3$ represent
the three different offsets of triples in the pattern. For a schema of these three
patterns, see Fig. 1.

Fig. 1. The three patterns resulting from different triple offsets.

For each one of $P_1$, $P_2$ and $P_3$, the central symbol in every triple (the one shaded
in Fig. 1) has the same value as the respective element of $P$. All other symbols are
“don’t care”s. The sum of the mismatches of $P_i$ in $T$, $i = 1, 2, 3$, is precisely
the number of mismatches of $P$ in $T$. Therefore, half of this sum is the desired
number of swaps.
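
The decomposition can be sketched in a few lines. Which residue class corresponds to which of $P_1$, $P_2$, $P_3$ is an assumption here, since the figure is not available; the point is only that every pattern position is the central symbol of exactly one of the three patterns. `DONT_CARE` and the function names are hypothetical.

```python
DONT_CARE = None  # hypothetical marker for a "don't care" position


def split_by_offset(pattern):
    """Three patterns; the r-th keeps the positions j with j % 3 == r and
    replaces every other position by a don't care."""
    return [[c if j % 3 == r else DONT_CARE for j, c in enumerate(pattern)]
            for r in range(3)]


def mismatches(pattern, text, i):
    """Mismatches of pattern against text[i:i+m], ignoring don't cares."""
    return sum(1 for j, c in enumerate(pattern)
               if c is not DONT_CARE and c != text[i + j])


if __name__ == "__main__":
    P, T = "abcab", "bacabxabcba"
    parts = split_by_offset(P)
    for i in range(len(T) - len(P) + 1):
        print(i, [mismatches(part, T, i) for part in parts],
              mismatches(list(P), T, i))
```

At every location the three partial counts add up to the mismatch count of the full pattern.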

Throughout the remainder of this paper we will concentrate on counting the
mismatches of $P_2$. The cases of $P_1$ and $P_3$ are similar. In the next section we
will show a randomized algorithm that allows efficient counting of mismatches
of $P_2$ by reducing the alphabet. Section 5 will show a deterministic alphabet
reduction.

4 Randomized Alphabet Reduction 

Let $h : \Sigma \to \{1, 2, 3, 4\}$ be chosen randomly. For string $S = s_1 \ldots s_m$ define
$h(S) = h(s_1), \ldots, h(s_m)$. Consider $h(P_2)$. Let $(x, y, z)$ be a triple such that $x \ne y$
or $y \ne z$ (i.e. a swap could happen). Call such a triple a potential swap triple.
We say that $h$ separates the triple $(x, y, z)$ if $h(x) \ne h(y)$ when $x \ne y$ and
$h(y) \ne h(z)$ when $y \ne z$.

If $h$ happens to separate every potential swap triple in the pattern, then, at
every location where there is a swap match, the number of mismatches of $P_2$ in $T$
equals the number of mismatches of $h(P_2)$ in $h(T)$. Moreover, the alphabet of
$h(P_2)$ and $h(T)$ is of size 4, hence the mismatches can be counted in time
$O(n \log m)$.

We need to be quite lucky to achieve the situation where all potential swap triples 
get separated by h. However, we really do not need such a drastic event. Every 
potential swap triple that gets separated, counts all its mismatches. From now 
on it can be replaced by “don’t care”s and never add mismatches. Conversely, 
every potential swap triple that does not get separated can be masked by “don’t 
care”s and not contribute anything. 

Our algorithm, then, is the following.

Algorithm

Let $P_t \leftarrow P_2$
Replace all non potential swap triples of $P_t$ with “don’t care”s
while not all triples have been masked do:
    choose a random $h : \Sigma \to \{1, 2, 3, 4\}$
    Let $P_q \leftarrow h(P_t)$
    Replace all non-separated triples of $P_q$ with “don’t care”s
    Count all mismatches of $P_q$ in $h(T)$
    Replace all triples of $P_t$ that were separated by $h$ with “don’t care”s
end Algorithm

Since counting all mismatches of $P_q$ in $h(T)$ can be done in time $O(n \log m)$, it
is sufficient to know the expected number of times we run through the while loop
to calculate the expected running time of the algorithm.

Claim. The expected number of times the while loop is executed in the above
algorithm is $O(\log \sigma)$, where $\sigma = \min(|\Sigma|, m)$.

Proof: The probability that a given potential swap triple gets separated is at least

$$\frac{3}{4} \cdot \frac{3}{4} = \frac{9}{16} > \frac{1}{2}.$$

Therefore, the expectation is that at least half of the triples will be separated
in the first execution of the while loop, with every subsequent execution of the
while loop separating half of the remaining triples. Since there are no more
than $\min(\frac{m}{3}, |\Sigma|^3)$ triples, the expectation is that in
$O(\log(\min(\frac{m}{3}, |\Sigma|^3))) = O(\log \sigma)$ executions of the while loop all triples
will be separated. □

Conclude: The expected running time of the algorithm is $O(n \log m \log \sigma)$.
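
The randomized alphabet reduction can be prototyped as follows. This is a sketch under several assumptions: hash values are drawn from $\{0, 1, 2, 3\}$ rather than $\{1, \ldots, 4\}$, a missing neighbour at the pattern boundary is treated as imposing no constraint, and mismatches over the reduced alphabet are counted naively instead of by an $O(n \log m)$ convolution. The returned values are meaningful only at locations where the pattern swap matches. All names are illustrative.

```python
import random


def triple(pattern, j):
    """Symbols around position j; None stands for a missing neighbour."""
    left = pattern[j - 1] if j > 0 else None
    right = pattern[j + 1] if j + 1 < len(pattern) else None
    return left, pattern[j], right


def is_potential_swap(t):
    x, y, z = t
    return (x is not None and x != y) or (z is not None and z != y)


def separates(h, t):
    """h(x) != h(y) whenever x != y, and h(y) != h(z) whenever y != z."""
    x, y, z = t
    return ((x is None or x == y or h[x] != h[y]) and
            (z is None or z == y or h[z] != h[y]))


def centre_mismatches(pattern, text, offset, rng):
    """For every alignment, count mismatches at the centres offset,
    offset + 3, ... using only comparisons over a 4-symbol alphabet."""
    m, n = len(pattern), len(text)
    alphabet = sorted(set(pattern) | set(text))
    remaining = {j for j in range(offset, m, 3)
                 if is_potential_swap(triple(pattern, j))}
    counts = [0] * (n - m + 1)
    while remaining:
        h = {c: rng.randrange(4) for c in alphabet}
        separated = {j for j in remaining if separates(h, triple(pattern, j))}
        for i in range(n - m + 1):
            # Naive counting stands in for the convolution-based step.
            counts[i] += sum(1 for j in separated
                             if h[pattern[j]] != h[text[i + j]])
        remaining -= separated          # these triples are now masked
    return counts


def swaps_at_matches(pattern, text, seed=0):
    """Half the total number of centre mismatches over the three offsets."""
    rng = random.Random(seed)
    per_offset = [centre_mismatches(pattern, text, r, rng) for r in range(3)]
    return [sum(col) // 2 for col in zip(*per_offset)]


if __name__ == "__main__":
    print(swaps_at_matches("abcab", "bacabxabcba"))
```

Each potential swap triple is counted in exactly one round, namely the first round whose hash function separates it, so at a swap match the result does not depend on the random choices.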

5 Deterministic Alphabet Reduction 

Recall that our task is really to separate all triples. There exists in the literature 
a powerful code that does this separation. Subsequently we show a simple code 
that solves our problem. 

Definition: A $(\Sigma, 3)$-universal set is a set $S = \{\chi_1, \ldots, \chi_k\}$ of characteristic
functions, $\chi_j : \Sigma \to \{0, 1\}$, such that for every $a, b, c \in \Sigma$, and for each of the
eight possible combinations of 0-1s, there exists $\chi_j$ such that $\chi_j(a), \chi_j(b), \chi_j(c)$
equals this combination.

We extend the definition of the functions $\chi_j$ to strings in the usual manner, i.e.
for $S = s_1 \ldots s_n$, $\chi_j(S) = \chi_j(s_1)\chi_j(s_2) \cdots \chi_j(s_n)$.
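
The definition can be tested mechanically on small alphabets. The checker below is a sketch that assumes the three symbols $a, b, c$ are meant to be distinct (with repeated symbols not all eight combinations are achievable); it is not the construction of [16]. The two example families, all functions on the alphabet and one function per bit of a symbol's index, are illustrative choices.

```python
from itertools import permutations, product


def is_universal(functions, alphabet):
    """True if every ordered triple of distinct symbols sees all eight
    0/1 combinations under some function of the family."""
    for a, b, c in permutations(alphabet, 3):
        seen = {(chi[a], chi[b], chi[c]) for chi in functions}
        if len(seen) < 8:
            return False
    return True


def all_functions(alphabet):
    """The (exponentially large) family of every map alphabet -> {0, 1}."""
    return [dict(zip(alphabet, bits))
            for bits in product((0, 1), repeat=len(alphabet))]


def bit_functions(alphabet):
    """One function per bit of each symbol's index in the alphabet."""
    width = max(1, (len(alphabet) - 1).bit_length())
    return [{c: (i >> b) & 1 for i, c in enumerate(alphabet)}
            for b in range(width)]


if __name__ == "__main__":
    sigma = "ACGT"
    print(is_universal(all_functions(sigma), sigma))
    print(is_universal(bit_functions(sigma), sigma))
```

The family of all functions is trivially universal but has size $2^{|\Sigma|}$, while the two bit functions on a four-letter alphabet are too few to realize eight combinations.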

Let $S = \{\chi_1, \ldots, \chi_k\}$ be a $(\Sigma, 3)$-universal set such that for every potential
swap triple $(a, b, c)$ there exists a $j$ for which $\chi_j(a) = 0$, $\chi_j(b) = 1$, $\chi_j(c) = 0$. We
run the following algorithm, which is very similar to the randomized algorithm
in Section 4.

Algorithm 

Let $P_t \leftarrow P$

Replace all non potential swap triples of $P_t$ with “don't care”s
for $j = 1$ to $k$ do:

Let $P_q \leftarrow \chi_j(P_t)$

Replace all non-separated triples of $P_q$ with “don't care”s
Count all mismatches of $P_q$ in $\chi_j(T)$

Replace all triples of $P_t$ that were separated by $\chi_j$ with “don't care”s
end Algorithm 

In [16] it was shown how to construct a $(\Sigma, 3)$-universal set of
cardinality $k = O(\log \sigma)$, yielding the following.

Corollary 1. The deterministic algorithm's running time is $O(n \log m \log \sigma)$.

The Naor and Naor construction of [16] is quite heavy. We conclude with an 
extremely simple coding of the alphabet that separates triples sufficiently well 
for our purposes. 

First note the following. 

Claim. It is sufficient to provide a set $S = \{\chi_1, \ldots, \chi_k\}$ of
characteristic functions $\chi_j : \Sigma \to \{0, 1\}$ such that for every
potential swap triple $(a, b, c)$ there either exists a $\chi_j$ such that
$\chi_j(a) = x$, $\chi_j(b) = 1 - x$ and $\chi_j(c) = x$, where $x \in \{0, 1\}$,
or there exist $\chi_{j_1}, \chi_{j_2}$ such that $\chi_{j_1}(a) = x$,
$\chi_{j_1}(b) = 1 - x$ and $\chi_{j_1}(c) = 1 - x$, and $\chi_{j_2}(a) = y$,
$\chi_{j_2}(b) = y$ and $\chi_{j_2}(c) = 1 - y$, where $x, y \in \{0, 1\}$.

Call such a set a swap separating set. 

Proof: Let $S = \{\chi_1, \ldots, \chi_k\}$ be a swap separating set. Every
potential swap triple for which there exists a $\chi_j$ such that
$\chi_j(a) = x$, $\chi_j(b) = 1 - x$ and $\chi_j(c) = x$, where $x \in \{0, 1\}$,
will be separated by $\chi_j$ and masked with “don't care”s for all other
characteristic functions. In other words, we initially decide, for each
function, which triples it separates, and mark those triples. If several
functions separate the same triple we will, of course, only use one of them.

For every other potential swap triple, there are $\chi_{j_1}, \chi_{j_2}$ such
that $\chi_{j_1}(a) = x$, $\chi_{j_1}(b) = 1 - x$ and $\chi_{j_1}(c) = 1 - x$,
and $\chi_{j_2}(a) = y$, $\chi_{j_2}(b) = y$ and $\chi_{j_2}(c) = 1 - y$, where
$x, y \in \{0, 1\}$. Every such triple will participate in the separation of
$\chi_{j_1}$ and $\chi_{j_2}$. For all other characteristic functions it will
be masked with “don't care”s.

Note that if there is a match of such a triple with a text location without
a swap, then neither $\chi_{j_1}$ nor $\chi_{j_2}$ will contribute a mismatch.
However, if the triple's match requires a swap, then exactly one of
$\chi_{j_1}$ or $\chi_{j_2}$ will contribute a mismatch. □

Our remaining task is to provide a simple construction for a swap separating set
of size $O(\log \sigma)$.

Swap Separating Set Construction 

Consider a $\sigma \times \log \sigma$ bit matrix $B$ where the rows are a binary
representation of the alphabet elements $(P \cap \Sigma)$. Take
$\chi_j(a) = B[a, j]$. In words, the characteristic functions are the columns of $B$.

For every potential swap triple $(a, b, c)$, if there is a column where the bits
of $a, b, c$ are $x, 1 - x, x$, then this column provides the function in which
the triple participates. If no such column exists, then there clearly are two
columns $j_1, j_2$ such that $B[a, j_1] \neq B[b, j_1]$ and
$B[c, j_2] \neq B[b, j_2]$. It is clear that $B[c, j_1] = B[b, j_1]$ and
$B[a, j_2] = B[b, j_2]$ (otherwise the first condition holds). The columns
$j_1$ and $j_2$ provide the functions where triple $(a, b, c)$ participates.
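
The column selection described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: it assumes alphabet symbols are coded as the integers 0, ..., σ-1, that bit `j` of the code plays the role of `B[a, j]`, and that a potential swap triple satisfies `a != b` and `c != b`. The function names `bit`, `separating_columns`, and `check_all_triples` are hypothetical.

```python
from itertools import product


def bit(symbol: int, j: int) -> int:
    """Entry B[symbol, j]: bit j of the integer code of an alphabet symbol."""
    return (symbol >> j) & 1


def separating_columns(a: int, b: int, c: int, width: int):
    """Pick the column(s) of B in which the triple (a, b, c) participates.

    Assumes a != b and c != b (our reading of 'potential swap triple').
    Returns ("single", j) or ("pair", j1, j2).
    """
    # Case 1: some column shows the pattern x, 1-x, x.
    for j in range(width):
        if bit(a, j) != bit(b, j) and bit(c, j) != bit(b, j):
            return ("single", j)
    # Case 2: j1 separates a from b, and j2 separates c from b.
    j1 = next(j for j in range(width) if bit(a, j) != bit(b, j))
    j2 = next(j for j in range(width) if bit(c, j) != bit(b, j))
    return ("pair", j1, j2)


def check_all_triples(sigma: int, width: int) -> bool:
    """Exhaustively verify the two-column property for symbols 0 .. sigma-1."""
    for a, b, c in product(range(sigma), repeat=3):
        if a == b or c == b:
            continue
        result = separating_columns(a, b, c, width)
        if result[0] == "pair":
            _, j1, j2 = result
            # As argued in the text: c agrees with b in column j1,
            # and a agrees with b in column j2.
            if bit(c, j1) != bit(b, j1) or bit(a, j2) != bit(b, j2):
                return False
    return True
```

Because distinct symbols have distinct binary codes, the two `next(...)` searches in the second case always find a column, so only ⌈log₂ σ⌉ columns are needed.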

6 Conclusion and Open Problems

We have shown an algorithm for the approximate swap matching problem that is
faster than the best known algorithm for the pattern matching with mismatches
problem. This is quite a surprising result considering that it was thought that
swap matching may be even harder than pattern matching with mismatches. It
leads us to conjecture that the current upper bound on the mismatches problem
($O(n\sqrt{m \log m})$) is not the final word.

The swap operation and the mismatch operation have proven to be relatively 
“easy” to solve. However, insertion and deletion are still not known to be solvable 
in time faster than the dynamic programming $O(nm)$ in the worst case. A lower
bound or a better upper bound on the complexity of edit distance would be of 
great interest. 

Abstract. This paper extends DeNicola and Hennessy's testing theory
from labeled transition systems to Büchi processes and establishes a tight
connection between the resulting Büchi must-preorder and satisfaction of
linear-time temporal logic (LTL) formulas. An example dealing with the
design of a communications protocol testifies to the utility of the theory
for heterogeneous system design, in which some components are specified
as labeled transition systems and others are given as LTL formulas.



1 Introduction 

Approaches to formally verifying reactive systems typically follow one of two 
paradigms. The first paradigm is founded on notions of refinement and is em- 
ployed in process algebra [2]. In such approaches one formulates specifications 
and implementations in the same notation and then proves that the latter refine 
the former. The underlying semantics is usually given operationally, and
refinement relations are formalized as preorders. Testing/failure preorders [4, 8] have
attracted particular attention because of their intuitive formulations in terms of 
responses a system exhibits to tests. Their strength is their support for compo- 
sitional reasoning, i.e., one may refine part of a system design independently of 
others, and their full abstractness with respect to trace inclusion [18]. 

The other paradigm relies on the use of temporal logics [22] to formulate 
specifications, with implementations being given in an operational notation. One 
then verifies a system by establishing that it is a model of its specification; model 
checkers [5] automate this task for finite-state systems. Temporal logics support 
the definition of properties that constrain single aspects of expected system 
behavior and, thus, allow a “property-at-a-time” approach. Such logics also 
have connections with automata over infinite words. For example, linear-time 
temporal logic (LTL) specifications may be translated into Büchi automata [27],
which allow semantic constraints on infinite behavior to be expressed. 

The objective of this paper is to develop a semantic framework that seamlessly
unifies testing-based refinement and LTL, thereby enabling the development of
design formalisms that provide support for both styles of verification. Using
Büchi automata and the testing framework of DeNicola and Hennessy [8] as
starting points, we approach this task by developing Büchi may- and
must-preorders that relate Büchi processes on the basis of their responses to
Büchi tests. Alternative characterizations are provided and employed for proving
conservative-extension results regarding DeNicola and Hennessy's testing
theory. We then apply this framework to defining a semantics for heterogeneous
design notations, where systems are specified using a mixture of labeled
transition systems and LTL formulas. This is done in two steps: first, we show
that our Büchi must-preorder is compositional for parallel composition and
scoping operators that are inspired by CCS [19]. Second, we establish that the
Büchi must-preorder reduces to a variant of reverse trace inclusion when its
first argument is purely nondeterministic. Consequently, the Büchi must-preorder
permits a uniform treatment of traditional notions of process refinement and
LTL satisfaction. The utility of our new theory is illustrated by means of a
small example featuring the heterogeneous design of a generic communications
protocol.

2 Büchi Testing

We extend the testing theory of DeNicola and Hennessy [8], which was developed
for labeled transition systems in a process-algebraic setting, to Büchi automata.
Traditional testing relates labeled transition systems via two preorders, the may-
and must-preorders, which distinguish systems on the basis of the tests they
might be able to, or are necessarily able to, pass. Büchi automata generalize
labeled transition systems by means of an acceptance condition for infinite traces.
However, the classical Büchi semantics, which identifies automata having the
same infinite languages, is in general not compositional with respect to parallel
composition operators, since it is insensitive to the potential for deadlock. Our
testing semantics is intended to overcome this problem. In the sequel, we refer
to Büchi automata as Büchi processes to emphasize that we are equipping Büchi
automata with a different semantics than the traditional one.

Basic Definitions. Our semantic framework is defined relative to some alphabet
$\mathcal{A}$, i.e., a countable set of actions which does not include the
distinguished unobservable, internal action $\tau$. In the remainder, we let
$a, b, \ldots$ range over $\mathcal{A}$ and $\alpha, \beta, \ldots$ over
$\mathcal{A} \cup \{\tau\}$. Büchi processes are distinguished from labeled
transition systems in their treatment of infinite traces. Whereas in labeled
transition systems all infinite traces are typically deemed possible, in Büchi
processes only those infinite traces that go through designated Büchi states
infinitely often are considered actual executions.

Definition 1 (Büchi process). A Büchi process is a tuple
$\langle P, \to, \surd, p \rangle$, where $P$ is a countable set of states,
$\to \subseteq P \times (\mathcal{A} \cup \{\tau\}) \times P$ is the transition
relation, $\surd \subseteq P$ is the Büchi set, and $p \in P$ is the start
state. If $\surd = P$ we refer to the Büchi process as a labeled transition system.

For convenience, we often write (i) $p' \xrightarrow{\alpha} p''$ instead of
$\langle p', \alpha, p'' \rangle \in \to$, (ii) $p' \xrightarrow{\alpha}$ for
$\exists p'' \in P.\ p' \xrightarrow{\alpha} p''$, (iii) $p' \to$ for
$\exists \alpha \in \mathcal{A} \cup \{\tau\}, p'' \in P.\ p' \xrightarrow{\alpha} p''$,
and (iv) $p'\surd$ for $p' \in \surd$. If no confusion arises, we abbreviate the
Büchi process $\langle P, \to, \surd, p \rangle$ by its start state $p$ and refer
to its transition relation and Büchi set as $\to_p$ and $\surd_p$, respectively.
Moreover, we denote the set of all Büchi processes by $\mathcal{P}$. Note that
we do not require Büchi processes to be finite-state.

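Definition 2 (Path & trace). Let $\langle P, \to, \surd, p \rangle$ be a Büchi
process. A path $\pi$ starting from state $p' \in P$ is a potentially infinite
sequence $(\langle p_{i-1}, \alpha_i, p_i \rangle)_{0 < i \le k}$, where
$k \in \mathbb{N} \cup \{\infty\}$, such that $k = 0$, or $p_0 = p'$ and
$p_{i-1} \xrightarrow{\alpha_i} p_i$, for all $0 < i \le k$. We use $|\pi|$ to
refer to $k$, the length of $\pi$. If $|\pi| = \infty$, we say that $\pi$ is
infinite; otherwise, $\pi$ is finite. If $|\pi| \in \mathbb{N}$ and
$p_{|\pi|} \not\to$, i.e., $p_{|\pi|}$ is a deadlock state, path $\pi$ is called
maximal. Path $\pi$ is referred to as a Büchi path if $|\pi| = \infty$ and
$|\{i \in \mathbb{N} \mid p_i \surd\}| = \infty$. The (visible) trace
trace$(\pi)$ of $\pi$ is defined as the sequence
$(\alpha_i)_{i \in I_\pi} \in \mathcal{A}^* \cup \mathcal{A}^\infty$, where
$I_\pi =_{df} \{0 < i \le |\pi| \mid \alpha_i \neq \tau\}$.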

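We denote the sets of all finite paths, all maximal paths, and all Büchi paths
starting from state $p' \in P$ by $\Pi_{fin}(p')$, $\Pi_{max}(p')$, and
$\Pi_B(p')$, respectively. The empty path $\pi$ with $|\pi| = 0$ is symbolized
by $()$ and its trace by $\epsilon$. We sometimes write $\hat{\alpha}$ for the
empty or single-element sequence trace$(\alpha)$ and use the notation
$p' \stackrel{w}{\Longrightarrow}_p p''$ to indicate that state $p'$ of Büchi
process $p$ may evolve to state $p''$ when observing trace $w$ for some path
$\pi \in \Pi_{fin}(p')$. Formally, $p' \stackrel{w}{\Longrightarrow}_p p''$ if
$\exists \pi = (\langle p_{i-1}, \alpha_i, p_i \rangle)_{0 < i \le k} \in \Pi_{fin}(p').\ p_0 = p'$,
$p_k = p''$, and trace$(\pi) = w$. Moreover,
$I_p(p') =_{df} \{a \in \mathcal{A} \mid \exists p''.\ p' \stackrel{a}{\Longrightarrow}_p p''\}$
is the set of initial actions of $p$ in state $p' \in P$. We may also introduce
different languages for Büchi process $p$.

$\mathcal{L}_{fin}(p) =_{df} \{\mathrm{trace}(\pi) \mid \pi \in \Pi_{fin}(p)\} \subseteq \mathcal{A}^*$   finite-trace language of $p$

$\mathcal{L}_{max}(p) =_{df} \{\mathrm{trace}(\pi) \mid \pi \in \Pi_{max}(p)\} \subseteq \mathcal{A}^*$   maximal-trace language of $p$

$\mathcal{L}_{B}(p) =_{df} \{\mathrm{trace}(\pi) \mid \pi \in \Pi_{B}(p)\} \subseteq \mathcal{A}^* \cup \mathcal{A}^\infty$   Büchi-trace language of $p$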
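
For finite-state processes, the definitions of path, trace, and finite-trace language can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the class name `BuchiProcess`, the string `"tau"` standing for the internal action, and the bound `max_len` on path length (the finite-trace language itself is unbounded in general).

```python
from dataclasses import dataclass

TAU = "tau"  # stands for the unobservable internal action


@dataclass(frozen=True)
class BuchiProcess:
    states: frozenset
    transitions: frozenset  # triples (source, action, target)
    buchi: frozenset        # the Büchi set, a subset of states
    start: str


def trace(path):
    """Visible trace of a finite path given as a sequence of transitions."""
    return tuple(action for (_, action, _) in path if action != TAU)


def is_maximal(proc, path, origin):
    """A finite path is maximal if its last state is a deadlock state."""
    last = path[-1][2] if path else origin
    return all(src != last for (src, _, _) in proc.transitions)


def finite_traces(proc, max_len):
    """Traces of all paths from the start state with at most max_len steps."""
    found = set()
    frontier = [(proc.start, ())]  # pairs (current state, path so far)
    for _ in range(max_len + 1):
        next_frontier = []
        for state, path in frontier:
            found.add(trace(path))
            for (src, act, dst) in proc.transitions:
                if src == state:
                    next_frontier.append((dst, path + ((src, act, dst),)))
        frontier = next_frontier
    return found


# A three-state example: p1 --tau--> p2 --a--> p3, with p3 a deadlock state.
example = BuchiProcess(
    states=frozenset({"p1", "p2", "p3"}),
    transitions=frozenset({("p1", TAU, "p2"), ("p2", "a", "p3")}),
    buchi=frozenset({"p3"}),
    start="p1",
)
```

Note that the internal step contributes nothing to the trace, so the two-step path of `example` has the one-element trace `("a",)`.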

A key notion in testing-based semantics is divergence, i.e., a system's ability
to engage in an infinite internal computation. In this paper, we use adaptations
of the traditional notions of DeNicola and Hennessy [8]; more sophisticated
definitions may be found elsewhere in the literature [3, 21, 23] but are
not considered here. We say that state $p'$ of Büchi process $p$ is (Büchi)
divergent, in symbols $p' \Uparrow_p$, if
$\exists \pi \in \Pi_B(p').\ \mathrm{trace}(\pi) = \epsilon$. State $p'$ is called
$w$-divergent for some
$w = (a_i)_{0 < i \le k} \in \mathcal{A}^* \cup \mathcal{A}^\infty$, in symbols
$p' \Uparrow_p w$, if one can reach a divergent state starting from $p'$ when
executing a finite prefix of $w$, i.e., if
$\exists l \le k, p'' \in P, w' \in \mathcal{A}^*.\ w' = (a_i)_{0 < i \le l}$,
$p' \stackrel{w'}{\Longrightarrow}_p p''$, and $p'' \Uparrow_p$.
For convenience, we write $\mathcal{L}_{div}(p')$ for the divergent-trace language
of $p'$, i.e.,
$\mathcal{L}_{div}(p') =_{df} \{w \in \mathcal{A}^* \cup \mathcal{A}^\infty \mid p' \Uparrow_p w\}$.
State $p'$ is convergent or $w$-convergent, in symbols $p' \Downarrow_p$ and
$p' \Downarrow_p w$, if not $p' \Uparrow_p$ and not $p' \Uparrow_p w$,
respectively. Note that a finite trace $w \in \mathcal{L}_B(p)$ indicates that
$p$ is divergent exactly after executing $w$. In




A Semantic Theory for Heterogeneous System Design 






the following, we often omit the indices of the divergence and convergence
predicates, as well as of the transition relations, whenever these are obvious
from the context. Finally, we write $w \cdot w'$ for the concatenation of finite
trace $w \in \mathcal{A}^*$ with the finite or infinite trace
$w' \in \mathcal{A}^* \cup \mathcal{A}^\infty$.
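
For a finite-state process, Büchi divergence of a state can be decided by a simple graph search: an infinite path with empty trace that visits Büchi states infinitely often exists exactly when some Büchi state lying on a cycle of internal steps is reachable via internal steps. The sketch below assumes this finite-state setting (the paper itself does not require it), represents transitions as `(source, action, target)` triples, and uses the hypothetical label `"tau"` for the internal action.

```python
TAU = "tau"  # stands for the unobservable internal action


def tau_successors(transitions, state):
    """States reachable from `state` by a single internal step."""
    return {dst for (src, act, dst) in transitions if src == state and act == TAU}


def tau_reachable(transitions, state):
    """States reachable from `state` via one or more internal steps."""
    seen, stack = set(), list(tau_successors(transitions, state))
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(tau_successors(transitions, s))
    return seen


def is_divergent(transitions, buchi, state):
    """Büchi divergence for finite-state processes.

    True iff a Büchi state on a tau-cycle is tau-reachable from `state`
    (or is `state` itself).
    """
    candidates = tau_reachable(transitions, state) | {state}
    return any(
        s in buchi and s in tau_reachable(transitions, s) for s in candidates
    )
```

For example, a single state with an internal self-loop is divergent when it belongs to the Büchi set and convergent otherwise, which is the point where this notion departs from classical divergence.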

Testing Theory. The testing framework of DeNicola and Hennessy defines 
behavioral preorders that relate labeled transition systems with respect to their 
responses to tests [8]. Tests are employed to witness the interactions a system 
may have with its environment. In our setting, a test is a Büchi process in
which certain states are designated as successful. In order to determine whether 
a system passes a test, one has to examine the finite and infinite computations 
that result when the test runs in lock-step with the system under consideration. 

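Definition 3 (Test, computation, success). A Büchi test
$\langle T, \to, \surd, t, Suc \rangle$ is a Büchi process
$\langle T, \to, \surd, t \rangle$ together with a set $Suc \subseteq T$ of
success states. If $\surd = \emptyset$, we call the test classical. The set of
all Büchi tests is denoted by $\mathcal{T}$.

A potential computation $c$ with respect to a Büchi process $p$ and a Büchi
test $t$ is a potentially infinite sequence
$(\langle p_{i-1}, t_{i-1} \rangle \overset{\alpha_i}{\longmapsto}_{r_i} \langle p_i, t_i \rangle)_{0 < i \le k}$,
where $k \in \mathbb{N} \cup \{\infty\}$, such that (1) $p_i \in P$ and
$t_i \in T$, for all $0 \le i \le k$, and (2)
$\alpha_i \in \mathcal{A} \cup \{\tau\}$ and
$r_i \in \{\blacktriangleleft, \blacktriangleright, \blacklozenge\}$, for all
$0 < i \le k$. The relation $\longmapsto$ is defined by:

• $\langle p_{i-1}, t_{i-1} \rangle \overset{\alpha_i}{\longmapsto}_{\blacktriangleleft} \langle p_i, t_i \rangle$ if $\alpha_i = \tau$, $t_{i-1} = t_i$, $p_{i-1} \xrightarrow{\tau}_p p_i$, and $t_{i-1} \notin Suc$.

• $\langle p_{i-1}, t_{i-1} \rangle \overset{\alpha_i}{\longmapsto}_{\blacktriangleright} \langle p_i, t_i \rangle$ if $\alpha_i = \tau$, $p_{i-1} = p_i$, $t_{i-1} \xrightarrow{\tau}_t t_i$, and $t_{i-1} \notin Suc$.

• $\langle p_{i-1}, t_{i-1} \rangle \overset{\alpha_i}{\longmapsto}_{\blacklozenge} \langle p_i, t_i \rangle$ if $\alpha_i \in \mathcal{A}$, $p_{i-1} \xrightarrow{\alpha_i}_p p_i$, $t_{i-1} \xrightarrow{\alpha_i}_t t_i$, and $t_{i-1} \notin Suc$.

The potential computation $c$ is finite if $|c| < \infty$ and infinite if
$|c| = \infty$. The projection $proj_p(c)$ of $c$ on $p$ is defined as
$(\langle p_{i-1}, \alpha_i, p_i \rangle)_{i \in I_p^c} \in \Pi(p)$, where
$I_p^c =_{df} \{0 < i \le k \mid r_i \in \{\blacktriangleleft, \blacklozenge\}\}$,
and the projection $proj_t(c)$ of $c$ on $t$ as
$(\langle t_{i-1}, \alpha_i, t_i \rangle)_{i \in I_t^c} \in \Pi(t)$, where
$I_t^c =_{df} \{0 < i \le k \mid r_i \in \{\blacktriangleright, \blacklozenge\}\}$.
A potential computation $c$ is called a computation if it satisfies the
following properties: (1) $c$ is maximal, i.e., if $|c| < \infty$ then
$\langle p_{|c|}, t_{|c|} \rangle \overset{\alpha}{\not\longmapsto}_r$ for any
$\alpha$ and $r$; and (2) if $|c| = \infty$ then $proj_p(c) \in \Pi_B(p)$. The
set of all computations of $p$ and $t$ is denoted by $C(p, t)$.

Computation $c$ is called successful if $t_{|c|} \in Suc$, in case
$|c| < \infty$, or if $proj_t(c) \in \Pi_B(t)$, in case $|c| = \infty$. We say
that $p$ may pass $t$, in symbols $p\ may_{CL}\ t$, if there exists a successful
computation $c \in C(p, t)$. Analogously, $p$ must pass $t$, in symbols
$p\ must_{CL}\ t$, if every computation $c \in C(p, t)$ is successful.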
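
The three step rules for potential computations can be read as a successor function on pairs of a process state and a test state. The following Python sketch is an illustration under assumptions, not part of the theory: transitions are `(source, action, target)` triples, `"tau"` stands for the internal action, and the tags `"process"`, `"test"`, and `"sync"` are hypothetical names for the three kinds of step.

```python
TAU = "tau"  # stands for the unobservable internal action


def steps(proc_trans, test_trans, success, p, t):
    """One-step successors of configuration (p, t) in a potential computation.

    Returns a sorted list of tuples (action, kind, p_next, t_next).
    """
    if t in success:
        # Every rule requires the current test state not to be a success state.
        return []
    result = []
    # Process performs an internal step on its own; the test state is unchanged.
    for (src, act, dst) in proc_trans:
        if src == p and act == TAU:
            result.append((TAU, "process", dst, t))
    # Test performs an internal step on its own; the process state is unchanged.
    for (src, act, dst) in test_trans:
        if src == t and act == TAU:
            result.append((TAU, "test", p, dst))
    # Process and test perform the same visible action in lock-step.
    for (ps, pa, pd) in proc_trans:
        if ps != p or pa == TAU:
            continue
        for (ts, ta, td) in test_trans:
            if ts == t and ta == pa:
                result.append((pa, "sync", pd, td))
    return sorted(result)
```

A finite potential computation is maximal exactly when `steps` returns an empty list for its last configuration, and it is then successful if the last test state is a success state.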

Intuitively, an infinite computation of process p and test t differs from an infinite 
potential computation in that in the former the process is required to enter a 
Büchi state infinitely often. An infinite computation is then successful if the test
also passes through a Büchi state infinitely often. Hence, in contrast with the
original theory of DeNicola and Hennessy, some infinite computations can be
successful in our setting. Also, since Büchi processes and Büchi tests potentially
exhibit nondeterministic behavior, one may distinguish between the possibility
and inevitability of success. This is captured in the following definitions of the
Büchi may- and must-preorders.

Definition 4 (Büchi Preorders). For Büchi processes $p$ and $q$ we define:

• $p \sqsubseteq^{may}_{CL} q$ if $\forall t \in \mathcal{T}.\ p\ may_{CL}\ t$ implies $q\ may_{CL}\ t$.

• $p \sqsubseteq^{must}_{CL} q$ if $\forall t \in \mathcal{T}.\ p\ must_{CL}\ t$ implies $q\ must_{CL}\ t$.

It is easy to check that $\sqsubseteq^{may}_{CL}$ and $\sqsubseteq^{must}_{CL}$
are preorders. The classical may- and must-preorders of DeNicola and Hennessy
are defined analogously, but with respect to transition systems and classical
tests [8]. Note that in this paper we consider the Büchi may-preorder only for
the sake of completing the Büchi testing theory; it is not used in our semantic
framework for heterogeneous system specification.



3 Alternative Characterizations and Conservativity 

We now present characterizations of our Büchi preorders and use these
characterizations as a basis for comparing DeNicola-Hennessy testing theory [8]
with ours.

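Theorem 1. Let $p$ and $q$ be Büchi processes. Then

1. $p \sqsubseteq^{may}_{CL} q$ iff $\mathcal{L}_{fin}(p) \subseteq \mathcal{L}_{fin}(q)$ and $\mathcal{L}_B(p) \subseteq \mathcal{L}_B(q)$.

2. $p \sqsubseteq^{must}_{CL} q$ iff for all $w \in \mathcal{A}^* \cup \mathcal{A}^\infty$ such that $p \Downarrow w$:

(a) $q \Downarrow w$

(b) $|w| < \infty$: $\forall q'.\ q \stackrel{w}{\Longrightarrow} q'$ implies $\exists p'.\ p \stackrel{w}{\Longrightarrow} p'$ and $I_p(p') \subseteq I_q(q')$.

    $|w| = \infty$: $w \in \mathcal{L}_B(q)$ implies $w \in \mathcal{L}_B(p)$.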



With respect to finite traces, the characterizations are virtually the same as 
the ones of DeNicola and Hennessy’s preorders [8]. However, we need to refine 
the classical characterizations in order to capture the sensitivity of Büchi may-
and must-testing to infinite behavior. The proof of this theorem relies on the
properties of several specific Büchi tests. Some of them are standard [8]; the other
ones are depicted in Fig. 1, where (i) w = (ai)o<i<fc G A* for tests and 

^must.max (y) ^ ^ (a^^eN G A°° for tests , tl, and In Fig- h 

Büchi states are marked by the symbol $\surd$, and success states are distinguished
by thick borders.

Intuitively, while Biichi test t'ff^'^’°° tests for the presence of Biichi trace w, 
Biichi tests and tf^ are capable of detecting divergent behavior when 

executing trace w. Moreover, Biichi tests and are concerned with 

the absence of maximal trace and Büchi trace $w$, respectively. These intuitions
are made precise by the following properties, which hold for any Büchi process $p$.

1. Let w G Then w G Cb{p) if and only if p maycL 

2. Let w G A* . Then w G Cb{p) if and only if p maycL 

3. Let w G A°°. Then p fj. w if and only if p mustcL tit,- 

4. Let w G A* s.t. p w. Then w ^ Tmax(p) if and only if p mustcL 

5. Let w G A°° s.t. p w. Then w ^ Tb(p) if and only if p mustcL 

Fig. 1. Büchi tests used for characterizing the Büchi may- and must-preorders.



The proof of Thm. 1, which can be found in [7], relies on these properties of 
Biichi tests. Specifically, it uses the infinite-state tests t^, and 

The employment of infinite-state tests, even when relating finite-state Büchi
processes, is justified by our view that Büchi tests represent the arbitrary,
potentially irregular behavior of the unknown system environment.

Using the above characterizations, we investigate the relation of our Büchi
preorders to the corresponding classical preorders, $\sqsubseteq^{may}_{DH}$ and
$\sqsubseteq^{must}_{DH}$, respectively, as defined by DeNicola and Hennessy [8].
It should be noted that their framework is restricted to image-finite labeled
transition systems and classical, image-finite tests; a labeled transition
system or Büchi process is called image-finite if every state has only a finite
number of outgoing transitions for any action.

Theorem 2. Let $p$ and $q$ be image-finite labeled transition systems.

1. If $p$ and $q$ are convergent, then $p \sqsubseteq^{may}_{CL} q$ if and only if $p \sqsubseteq^{may}_{DH} q$.
2. $p \sqsubseteq^{must}_{CL} q$ if and only if $p \sqsubseteq^{must}_{DH} q$.

We refer the reader to [7] for the proof of this theorem. In a nutshell, the
second part follows by inspection of the alternative characterizations of
$\sqsubseteq^{must}_{CL}$ and $\sqsubseteq^{must}_{DH}$. Thm. 2(1) is invalid if
one allows divergent labeled transition systems. As a counterexample consider
the transition systems
$\langle \{p\}, \{\langle p, \tau, p \rangle\}, \{p\}, p \rangle$ and
$\langle \{q\}, \emptyset, \{q\}, q \rangle$, as well as the Büchi test
$\langle \{t\}, \{\langle t, \tau, t \rangle\}, \{t\}, t, \emptyset \rangle$.
Then, $p \sqsubseteq^{may}_{DH} q$ since
$\mathcal{L}_{fin}(p) = \mathcal{L}_{fin}(q) = \{\epsilon\}$, but
$p \not\sqsubseteq^{may}_{CL} q$ since $p\ may_{CL}\ t$ and $q$ may not pass $t$.

4 Büchi Testing and Heterogeneous System Design

In this section we investigate the utility of our theory as a semantic framework for 
heterogeneous design notations that mix labeled transition systems and formulas 
in LTL. The design methodology which we wish to support is component-based, 
where a system designer starts off with a system architecture, with components 
given either as automata or, more abstractly, as LTL formulas. Then the system
is refined by successively implementing each component as a labeled transition 
system satisfying its specification. To support such a methodology mathemat- 
ically, one needs a refinement preorder which satisfies at least two properties. 
First, it must be compositional for key operators of such design languages. Sec- 
ond, it must be “compatible” with the LTL satisfaction relation. We show that 
our Büchi must-preorder obeys both properties.



Büchi Testing and Compositionality. In the component-based design framework
we wish to study, two operators are central: (i) parallel composition, for
connecting concurrent components and allowing them to interact via system 
channels, and (ii) restriction, for restricting access to channels to certain system 
components. In the following, we introduce two such operators that allow us to 
give the reader hints about the application of the semantic theory developed so 
far. While other operators are of course possible, the ones considered here suffice 
for the purposes of the example in the next section. 

Our parallel composition operator “|” and the restriction operator $\setminus A$,
where $A \subseteq \mathcal{A}$, are inspired by the ones in the process algebra
CCS [19]. We assume that alphabet $\mathcal{A}$ is composed of two sets
$\mathcal{A}!$ and $\mathcal{A}?$, representing sending and receiving actions,
such that for every $a! \in \mathcal{A}!$ there exists a corresponding
$a? \in \mathcal{A}?$, and vice versa. Here, $a$ should be interpreted as a
channel name. The intuition for parallel composition in CCS is that a process
willing to send a message on channel $a$ and another one able to receive a
message on $a$ can do so by performing the actions $a!$ and $a?$ in synchrony
with each other. This handshake is invisible to an external observer, i.e., it
results in the distinguished, unobservable action $\tau$. When adapting the CCS
parallel operator to our framework of Büchi processes, the question that
naturally arises concerns the interpretation of Büchi traces. We adopt the
following point of view: intuitively, “fair merges” of Büchi traces of $p$ and
$q$ should also be Büchi traces of $p|q$. Moreover, a Büchi trace of one
process, when merged with a finite trace of the other process, should also
result in a Büchi trace of $p|q$.

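Formally, our parallel composition of Büchi processes
$\langle P, \to_p, \surd_p, p \rangle$ and $\langle Q, \to_q, \surd_q, q \rangle$
is defined as the Büchi process
$\langle P|Q, \to_{p|q}, \surd_{p|q}, p|q \rangle$, where
$P|Q =_{df} \{p'|q' \mid p' \in P, q' \in Q\} \cup \{q'|p' \mid p' \in P, q' \in Q\}$
and where $\to_{p|q}$ is the least relation such that:

(1) $p' \xrightarrow{\alpha}_p p''$ implies $p'|q' \xrightarrow{\alpha}_{p|q} q'|p''$ if $p' \surd_p$
(2) $p' \xrightarrow{\alpha}_p p''$ implies $p'|q' \xrightarrow{\alpha}_{p|q} p''|q'$ if not $p' \surd_p$
(3) $q' \xrightarrow{\alpha}_q q''$ implies $p'|q' \xrightarrow{\alpha}_{p|q} q''|p'$
(4) $p' \xrightarrow{a!}_p p''$ and $q' \xrightarrow{a?}_q q''$ implies $p'|q' \xrightarrow{\tau}_{p|q} q''|p''$ if $p' \surd_p$
(5) $p' \xrightarrow{a!}_p p''$ and $q' \xrightarrow{a?}_q q''$ implies $p'|q' \xrightarrow{\tau}_{p|q} p''|q''$ if not $p' \surd_p$
(6) $p' \xrightarrow{a?}_p p''$ and $q' \xrightarrow{a!}_q q''$ implies $p'|q' \xrightarrow{\tau}_{p|q} q''|p''$ if $p' \surd_p$
(7) $p' \xrightarrow{a?}_p p''$ and $q' \xrightarrow{a!}_q q''$ implies $p'|q' \xrightarrow{\tau}_{p|q} p''|q''$ if not $p' \surd_p$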

These rules are in accordance with our above-mentioned intuition of system
behavior. The “switching” of the states of $p$ and $q$ in Rules (1), (3), (4),
and (6) allows us to fairly merge “Büchi traces with Büchi traces” and “Büchi
traces with finite traces” of the argument Büchi processes. Finally, the Büchi
predicate $\surd_{p|q}$ is defined by $p'|q' \surd_{p|q}$ if $p' \surd_p$, for
$p' \in P$ and $q' \in Q$. The unary restriction operator $\setminus A$, for
$A \subseteq \mathcal{A}$, essentially is a scoping mechanism on channel names.
Intuitively, $p \setminus A$ is defined as the Büchi process $p$, except that
all transitions labeled by actions $a!$ and $a?$, where $a \in A$, are eliminated.
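
On a finite transition relation, restriction amounts to filtering out transitions. The sketch below assumes, purely for illustration, that actions are strings of the form `"a!"` and `"a?"` for a channel name `a`, and that transitions are `(source, action, target)` triples; the function name `restrict` is hypothetical.

```python
def restrict(transitions, channels):
    """Drop every transition labeled a! or a? for a channel a in `channels`.

    Internal steps and actions on other channels are kept unchanged.
    """
    blocked = {c + "!" for c in channels} | {c + "?" for c in channels}
    return {
        (src, act, dst) for (src, act, dst) in transitions if act not in blocked
    }
```

States, Büchi set, and start state are left as they are, which is why only the transition relation appears in the sketch.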

By referring to the characterizations of the Büchi may- and must-preorders
one can establish the desired compositionality results: the Büchi may- and
must-preorders are substitutive under parallel composition and restriction.

Büchi Must-testing and LTL Satisfaction. We now show that the Büchi
must-preorder is compatible with the LTL satisfaction relation $\models$, which
relates labeled transition systems and LTL formulas [22]. By “compatible” we
mean that, for every LTL formula $\phi$, there exists a Büchi process $B_\phi$
such that the following holds for any labeled transition system $p$:
$p \models \phi$ if and only if $B_\phi \sqsubseteq^{must}_{CL} p$, i.e., the
‘implementation’ $p$ refines the ‘specification’ $\phi$.

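To achieve this goal, we characterize the Büchi must-preorder for a certain
class of Büchi processes by means of trace inclusion. We call a Büchi process $p$
purely nondeterministic, if for all $p' \in P$: (i) $p' \xrightarrow{a}_p$
implies $p' \not\xrightarrow{\tau}_p$, for all $a \in \mathcal{A}$, and (ii)
$|\{\langle a, p'' \rangle \in \mathcal{A} \times P \mid p' \xrightarrow{a}_p p''\}| = 1$.
Note that a Büchi process $p$ can be transformed to a purely nondeterministic
Büchi process $p'$, such that $\mathcal{L}_{div}(p) = \mathcal{L}_{div}(p')$,
$\mathcal{L}_{fin}(p) = \mathcal{L}_{fin}(p')$,
$\mathcal{L}_{max}(p) = \mathcal{L}_{max}(p')$, and
$\mathcal{L}_B(p) = \mathcal{L}_B(p')$, by splitting every transition
$p' \xrightarrow{a}_p p''$ into two transitions
$p' \xrightarrow{\tau} p_{\langle p', a, p'' \rangle} \xrightarrow{a} p''$, where
$p_{\langle p', a, p'' \rangle} \notin P$ is a new, distinguished state.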
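
The splitting of transitions can be sketched for finite processes as follows. Several choices here are assumptions rather than part of the text: internal transitions are left untouched, the fresh intermediate states are tuples tagged `"mid"` (so original states must not have that shape), and the fresh states are kept outside the Büchi set so that a path's visits to Büchi states are unchanged. The check `is_purely_nondeterministic` encodes one reading of the definition: a state with a visible transition has no internal transition and exactly one visible transition.

```python
TAU = "tau"  # stands for the unobservable internal action


def split_transitions(states, transitions, buchi):
    """Route every visible transition through a fresh intermediate state."""
    new_states = set(states)
    new_transitions = set()
    for (src, act, dst) in transitions:
        if act == TAU:
            new_transitions.add((src, act, dst))  # internal steps are kept
            continue
        mid = ("mid", src, act, dst)  # fresh state for this transition
        new_states.add(mid)
        new_transitions.add((src, TAU, mid))
        new_transitions.add((mid, act, dst))
    return new_states, new_transitions, set(buchi)


def is_purely_nondeterministic(states, transitions):
    """Check the condition on every state (one reading of the definition)."""
    for s in states:
        visible = [(a, d) for (src, a, d) in transitions if src == s and a != TAU]
        has_tau = any(src == s and a == TAU for (src, a, _) in transitions)
        if visible and (has_tau or len(visible) != 1):
            return False
    return True
```

After the transformation, every choice between visible actions has become a choice between internal steps, which is what the name "purely nondeterministic" suggests.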

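Theorem 3. Let $p$ and $q$ be Büchi processes and $p$ be purely nondeterministic.
Then, $p \sqsubseteq^{must}_{CL} q$ if and only if
(i) $\mathcal{L}_{div}(q) \subseteq \mathcal{L}_{div}(p)$,
(ii) $\mathcal{L}_{fin}(q) \setminus \mathcal{L}_{div}(p) \subseteq \mathcal{L}_{fin}(p)$,
(iii) $\mathcal{L}_{max}(q) \setminus \mathcal{L}_{div}(p) \subseteq \mathcal{L}_{max}(p)$, and
(iv) $\mathcal{L}_{B}(q) \setminus \mathcal{L}_{div}(p) \subseteq \mathcal{L}_{B}(p)$.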

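The necessity of the premise of this theorem, whose proof is in [7], may be
demonstrated by Büchi processes
$p =_{df} \langle \{p_1, p_2\}, \{\langle p_1, a, p_1 \rangle, \langle p_1, b, p_2 \rangle\}, \emptyset, p_1 \rangle$
and
$q =_{df} \langle \{q_1, q_2\}, \{\langle q_1, b, q_2 \rangle\}, \emptyset, q_1 \rangle$.
Then $p$ is not purely nondeterministic and Inclusions (i)-(iv) obviously hold,
but $p \not\sqsubseteq^{must}_{CL} q$ since $p\ must_{CL}\ t$ while $q$ does not
necessarily pass $t$, for the Büchi test
$t =_{df} \langle \{t_1, t_2\}, \{\langle t_1, a, t_2 \rangle\}, \emptyset, t_1, \{t_2\} \rangle$.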

The above theorem is the key for establishing the desired connection between
the Büchi must-preorder $\sqsubseteq^{must}_{CL}$ and the satisfaction relation
$\models$ for LTL. In particular, well-known constructions, starting with the
seminal work of Vardi and Wolper [27], exist for converting LTL formulas into
Büchi automata whose languages consist precisely of the models of the
corresponding formulas. These constructions may be adapted to yield purely
nondeterministic Büchi processes. However, there are a few subtleties of our
setting compared to the traditional one on which we need to comment. First of
all, our framework is concerned with labeled transition systems, so we must be
able to interpret LTL formulas with respect to sequences of actions rather than
states. Also, our framework is not only concerned with Büchi traces but also
with finite traces (i.e., deadlocks) and divergent traces. The syntax and
semantics of LTL may be modified to cope with these new phenomena; the details
are not difficult and are omitted. The classical constructions of Büchi automata
from LTL formulas may then be adapted to cope with the modifications to the
logic. Whereas the adaptation for deadlock is well-known [17], the handling of
divergence requires some attention. Intuitively, in a Büchi process a divergent
state may engage in arbitrary behavior; this is reflected in its divergence
language, which is $\mathcal{A}^* \cup \mathcal{A}^\infty$ (cf. Sec. 2). The
only LTL formulas satisfied by arbitrary behavior are tautologies. Hence, in the
Büchi process construction for LTL formulas, every state which corresponds to a
tautology needs to be made divergent. Having these twists in mind, one may
obtain the following variant of the key theorem for automata-based LTL model
checking (cf. [27]), where $B_\phi$ denotes the Büchi process constructed for
LTL formula $\phi$.

Theorem 4. Let $p$ be a labeled transition system and $\phi$ be an LTL formula.
Then, $p \models \phi$ if and only if Inclusions (i)-(iv) in Thm. 3 hold for
$B_\phi$ and $p$ (i.e., replace $p$ in Inclusions (i)-(iv) by $B_\phi$ and $q$
by $p$).

Note that the “$\Longrightarrow$”-direction of Thm. 4 is invalid if $p$ is
allowed to be an arbitrary Büchi process rather than a labeled transition
system. As a counterexample consider
$p =_{df} \langle \{p_1, p_2, p_3\}, \{\langle p_1, a, p_2 \rangle, \langle p_1, b, p_3 \rangle, \langle p_3, b, p_3 \rangle\}, \emptyset, p_1 \rangle$
and $\phi =_{df} a$. Then $p \models a$ as $b^\infty \notin \mathcal{L}_B(p)$ and
$b \in \mathcal{L}_{fin}(p) \setminus \mathcal{L}_{div}(B_\phi)$. But obviously
$b \notin \mathcal{L}_{fin}(B_\phi)$. When transforming $B_\phi$ to a purely
nondeterministic Büchi process $B'_\phi$ as outlined above, we may combine
Thms. 3 and 4 to obtain our desired result.

Corollary 1 (Büchi must-testing and LTL). Let $p$ be a labeled transition
system and $\phi$ be an LTL formula. Then, $p \models \phi$ if and only if
$B'_\phi \sqsubseteq^{must}_{CL} p$.

Hence, our notion of Büchi must-testing not only extends DeNicola and
Hennessy's [8] and Narayan Kumar et al.'s [20] must-preorders to (arbitrary)
Büchi processes, but is also compatible with the LTL satisfaction relation.



5 Example 

As an example for the utility of our theory for heterogeneous system design, 
consider the design of a very simple communications protocol given in Fig. 2. 







Fig. 2. A simple communications protocol. 

The architecture of the protocol has already been fixed by the system designer
and consists of a sender Sender, a medium Medium, and a receiver Receiver.
The components communicate with the protocol's environment and among themselves
via channels. In case of component Sender, these are the channels send,
put, and gack (get acknowledgment). Each component in turn has its own
specification. Receiver and Medium are given as labeled transition systems,
reflecting the fact that their designs are relatively advanced. Sender, in
contrast, is specified as an LTL formula stating that whenever a send? action
occurs during an execution sequence of the sender, the remainder of the
execution must begin with a sequence of put! actions followed by a gack? action.
Finally, the overall specification of the protocol's required behavior may be
given by the LTL formula
$Spec =_{df} \mathbf{G}\,(send? \to (\mathbf{F}\, recv!))$. This formula dictates
that in any sequence of actions which the system performs, whenever a send?
action occurs, a recv! action eventually follows. An obvious question that a
designer would be interested in is whether the specification of the sender is
“strong enough” to ensure that the protocol satisfies Spec. The theory developed
in this paper provides the semantic framework for answering this question. To do
so, we first construct the purely nondeterministic Büchi process $B_{Spec}$ for
LTL formula Spec, as well as Büchi process $B_{Sender}$ for LTL formula
$\phi_{Sender}$. Next we assemble the overall system by employing our parallel
composition and restriction operators.

$System =_{df} (B_{Sender} \mid Medium \mid Receiver) \setminus \{put, get, pack, gack\}$

Finally, we determine whether or not $B_{Spec} \sqsubseteq^{must}_{CL} System$; this indeed holds.
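
To give a feeling for what Spec demands of a single execution, the following Python sketch checks the formula G(send? → F recv!) on one infinite, ultimately periodic action sequence, given as a finite prefix followed by a cycle that repeats forever. This is an illustration only: it covers infinite sequences of this shape, does not reproduce the modified LTL semantics for finite and divergent traces mentioned earlier, and is not a procedure for deciding the must-preorder. The function name `satisfies_spec` is hypothetical.

```python
def satisfies_spec(prefix, cycle):
    """Check G(send? -> F recv!) on the infinite word prefix . cycle^omega."""
    if not cycle:
        raise ValueError("cycle must be non-empty")
    if "recv!" in cycle:
        # recv! recurs forever, so every send? is eventually followed by one.
        return True
    if "send?" in cycle:
        # send? recurs forever, but no recv! ever occurs after the prefix.
        return False
    # Only the prefix matters: the last send? must be followed by a recv!.
    pending = False
    for action in prefix:
        if action == "send?":
            pending = True
        elif action == "recv!":
            pending = False
    return not pending
```

For instance, a run in which a message is sent and then only put! actions follow forever violates the formula, whereas a run whose repeating part contains recv! satisfies it.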

The development of an efficient algorithm for automatically determining
whether two Büchi processes are related by $\sqsubseteq^{must}_{CL}$ is future
work. However, the alternative characterization of $\sqsubseteq^{must}_{CL}$
(cf. Thm. 1) already provides some hints about how this can be done. Due to the
compositionality of the Büchi must-preorder, our positive answer is preserved
when replacing $B_{Sender}$ with any Büchi process $p$ such that
$B_{Sender} \sqsubseteq^{must}_{CL} p$. If $p$ is a labeled transition system
and $B_{Sender}$ is made purely nondeterministic, then
$B_{Sender} \sqsubseteq^{must}_{CL} p$ holds exactly when
$p \models \phi_{Sender}$, according to Cor. 1.

One such $p$ is depicted to the right.

6 Related Work 

Starting with the same motivation we did, Abadi and Lamport have developed 
ideas for heterogeneous specification for shared-memory systems [1]. Their tech- 
nical setting is the logical framework of TLA [15], in which processes and tem- 
poral formulas are indistinguishable and logical implication serves as the refine- 
ment relation. TLA refinement coincides in some sense with trace inclusion in 
our testing scenario and is therefore insensitive to deadlock and divergence. Such 
issues are not of concern in the shared-memory world but must be dealt with 
in our setting, which is targeted towards specifying distributed systems in which 
components can interact directly, rather than indirectly via the shared memory. 



when replacing Bsender 

put 
gack 
Of direct relevance to this paper is the work of Kurshan [14], who developed
a theory of ω-word automata that includes notions of synchronous and
asynchronous composition. However, his underlying semantic model maps processes
to their maximal (infinite) traces, and the associated notion of refinement
is (reverse) trace inclusion. In theories of concurrency such as CCS [19] and
CSP [4], in which deadlock is possible, maximal trace inclusion is not
compositional [18]. In contrast, our must-preorder is compositional, at least
for the operators presented here. Other work [6, 10, 13] investigated modular
and compositional model-checking in similar non-deadlock environments.

Relatively more work has been devoted to analyzing relationships between re- 
finement and logical approaches. One line of study relates temporal-logic specifi- 
cations to refinement-based ones by establishing that one system refines another 
if and only if both satisfy the same properties. Results along these lines were 
pioneered by Hennessy and Milner [11] for bisimulation equivalence and a modal 
logic of their devising [19]. Similar ideas were also adapted regarding other be- 
havioral equivalences and preorders and other temporal logics [9, 19, 25]. Con- 
gruences preserving “next-time-less” LTL have been studied by Kaivola and 
Valmari in [12]; the results have subsequently been extended to handle dead- 
lock [26] and livelock [23]. Our work differs from theirs in that we want to have 
LTL formulas embedded in specifications. 

Another line of research involves the encoding of labeled transition systems 
as logical formulas, and vice versa. Steffen and Ingolfsdottir [24] defined an algo- 
rithm for converting finite-state labeled transition systems into formulas in the 
mu-calculus, while Larsen [16] demonstrated that certain mu-calculus formulas 
can be encoded as bisimulation-based implicit specifications. Finally, traditional 
testing has also been enriched with notions of fairness [3, 21] in order to con- 
strain infinite computations in labeled transition systems. 

7 Conclusions and Future Work 

We conservatively extended the testing theories of De Nicola and Hennessy [8]
and Narayan Kumar et al. [20] to Büchi processes. We then studied the derived
Büchi may- and must-preorders, developed alternative characterizations for
them, argued that the preorders are substitutive for several operators necessary
for component-based system design, and showed that the Büchi must-preorder
degrades to a variant of reverse trace inclusion when its first argument is
purely nondeterministic. Using the latter result, we illustrated that Büchi
must-testing provides a uniform basis for analyzing heterogeneous system designs
given as a mixture of labeled transition systems and LTL formulas.

Regarding future work, we plan to develop specification languages mixing
process algebras and LTL, which are given a semantics in terms of Büchi testing.
We also intend to explore algorithms for computing our Büchi must-preorder.
Abstract. This paper presents the first formal verification of the
Ricart-Agrawala algorithm [RA81] for distributed mutual exclusion of an
arbitrary number of nodes. It uses the Temporal Methodology of [MP95a].
We establish both the safety property of mutual exclusion and the liveness
property of accessibility. To establish these properties for an arbitrary
number of nodes, parameterized proof rules are used as presented
in [MP95a] (for safety) and [MP94] (for liveness). A new and efficient
notation is introduced to facilitate the presentation of liveness proofs by
verification diagrams.

The proofs were carried out using the Stanford Temporal Prover (STeP)
[BBC+95], a software package that supports formal verification of temporal
specifications of concurrent and reactive systems.



1 Introduction 

The Ricart-Agrawala algorithm (RA) [RA81] for achieving mutual exclusion 
in a network is one of the venerable and well-known algorithms in distributed 
computing. Nevertheless, the correctness of the algorithm has not been formally 
verified. 

The only previous attempt to formally prove the RA algorithm is the un- 
published work [Kam95], but it is restricted to the safety property of mutual 
exclusion and uses a simplified model. On the other hand, already [Lamp82] 
presented a non-mechanized proof of a similar algorithm. 

The main motivation for this work was to attempt a fully mechanized formal 
deductive proof of the RA algorithm, establishing both its safety and liveness 
properties, and using the deductive methods of [MP91]. 

These methods have been mechanized in a software package called the Stanford
Temporal Prover (STeP) [BBC+95]. A further motivation of this work
was to push STeP to its limits, and see whether it could be used to prove an
algorithm whose correctness proofs are quite complex.

* This research was supported in part by the Minerva Center for Verification of Reac- 
tive Systems, and a grant from the U.S.-Israel bi-national science foundation. 



S. Kapoor and S. Prasad (Eds.): FST TCS2000, LNCS 1974, pp. 325—335, 2000. 
We successfully generated formal proofs of both mutual exclusion and
accessibility using STeP. This research points the way for further improvements
both in proof techniques and in software support for deductive verification methods.



2 Implementation of the Ricart-Agrawala Algorithm 



To verify the RA algorithm, we have written it in a formal programming nota- 
tion, the language SPL which is used in [MP91] as the programming language 
(Figure 1). 



in     N : integer where N ≥ 2
local  chq, chp : array [1..N, 1..N] of channel [1..] of integer
                  where chq = Λ, chp = Λ
       y, z : [1..N] where y = 1
type   Nar = array of integer
       Bar = array of boolean
value  mini : Nar × Bar → [1..N]

||(s = 1..N) Node[s] ::
  local osn, hsn, p, c : integer where osn = 0, hsn = 0, c = 0, p = 1
        rcs : boolean where rcs = F
        rd  : array [1..N] of boolean where rd = F

  M :: loop forever do
         m1  : noncritical
         m2  : ⟨rcs := T; osn := hsn + 1; c := N − 1; p := 1;
                y := mini(osn, rcs)⟩
         m31 : while p ≤ N do
         m32 :   ⟨if p ≠ s then chq[s, p] ⇐ osn; p := p + 1⟩
         m4  : await c = 0
         m5  : critical
         m6  : ⟨rcs := F; p := 1; y := mini(osn, rcs)⟩
         m71 : while p ≤ N do
         m72 :   ⟨if rd[p] then [rd[p] := F; chp[s, p] ⇐ 1]; p := p + 1⟩
  ||
  Q :: ||(t = 1..N) Q[t] ::
         local rq : integer
         loop forever do
           q1 : ⟨chq[t, s] ⇒ rq;
                 if hsn < rq then hsn := rq;
                 if (rq, t) ≺ (osn, s) ∨ ¬rcs
                   then chp[s, t] ⇐ 1 else rd[t] := T⟩
  ||
  P :: ||(u = 1..N) P[u] ::
         local rp : integer
         loop forever do
           r1 : ⟨chp[u, s] ⇒ rp; c := c − 1⟩

Fig. 1. Implementation of the RA algorithm.
The structure of the program is as follows: we assume that there are N nodes,
where N is a parameter of the program which stays fixed during execution. Each
node is a concurrent process: in the notation Node[s] :: [...], the ellipsis
indicates the program text for the s'th node and s is the index of the node,
which may be referenced within the program text. The entire program is
$\|_{s=1}^{N}\, \mathit{Node}[s]$, implying a concurrent execution of all the nodes.

The nodes are connected to each other in a complete graph: there is a pair
of uni-directional asynchronous channels connecting each node to every other
node, where chq is the outgoing channel for the REQUEST messages and chp
is the incoming channel for the REPLY messages. The notation for output is
chq[a,b] ⇐ e, meaning that node a sends the value of expression e to node b
along channel chq, and similarly, chq[a,b] ⇒ x means that node b removes the
value coming from a and assigns it to x.

The additional global declarations are discussed below.

The program for process Node is composed of three concurrent processes:

- M is the main process containing the critical section and the protocols to
  be executed upon entry and exit.

- P is the process which receives and counts replies.

- Q is the process which receives requests and decides whether to reply or to
  defer the reply.

Note that P and Q are themselves composed of concurrent processes, one for each
channel. Within Node[s], process Q[t] (which can also be identified as Q[t, s]) is
responsible for reading messages from channel chq[t, s]. Similarly, process P[u]
(P[u, s]) is responsible for reading messages from channel chp[u, s]. The
synchronization among the processes within the same node is based on shared
variables, and we use the notation ⟨...⟩ to imply that the statements are to be
executed atomically. This can be easily implemented using semaphores. We include
the assignments c := N−1 and p := 1 within the atomic statement of line m2, and
p := 1 within line m6, to reduce the number of verification conditions in the
proof of accessibility.

Within each node there are global variables which are shared among the
processes of that node:

- osn - the sequence number chosen by the node.

- hsn - the highest sequence number seen in any request message received by
  the node.

- rcs - a flag that is true if the node is requesting to enter the critical section.

- c - a counter of the number of outstanding reply messages.

- rd - an array that lists deferred requests. rd[j] is true when the node is
  deferring a reply to the request from node j.

- p, rp, rq are auxiliary variables and could have been declared as local to the
  processes of the node.
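The interplay of these variables can be exercised in a small simulation. The sketch below is not the SPL program of Fig. 1 and is no substitute for the deductive proof: it assumes coarser atomicity (a node sends all its requests, or all its deferred replies, in one step), numbers nodes from 0, and uses hypothetical names (`Node`, `simulate`, `in_cs`). It runs one random interleaving and asserts mutual exclusion on every entry.

```python
import random
from collections import deque

class Node:
    """State of one node; field names follow the shared variables listed above."""
    def __init__(self, n):
        self.osn = 0            # sequence number chosen by the node
        self.hsn = 0            # highest sequence number seen so far
        self.rcs = False        # True while requesting the critical section
        self.c = 0              # number of outstanding replies
        self.rd = [False] * n   # rd[t]: the reply to node t is deferred
        self.in_cs = False      # True while inside the critical section

def simulate(n=4, steps=5000, seed=0):
    """Run a random interleaving; return the number of critical-section entries."""
    rng = random.Random(seed)
    nodes = [Node(n) for _ in range(n)]
    pairs = [(a, b) for a in range(n) for b in range(n) if a != b]
    chq = {ab: deque() for ab in pairs}   # chq[a, b]: requests from a to b
    chp = {ab: deque() for ab in pairs}   # chp[a, b]: replies from a to b
    entries = 0
    for _ in range(steps):
        actions = []
        for s, nd in enumerate(nodes):
            if nd.in_cs:
                actions.append(("exit", s, s))
            elif not nd.rcs:
                actions.append(("request", s, s))
            elif nd.c == 0:
                actions.append(("enter", s, s))
        actions += [("recv_request", b, a) for (a, b) in pairs if chq[a, b]]
        actions += [("recv_reply", b, a) for (a, b) in pairs if chp[a, b]]
        if not actions:
            break
        kind, s, t = rng.choice(actions)
        nd = nodes[s]
        if kind == "request":
            nd.rcs, nd.osn, nd.c = True, nd.hsn + 1, n - 1
            for p in range(n):
                if p != s:
                    chq[s, p].append(nd.osn)
        elif kind == "recv_request":
            rq = chq[t, s].popleft()
            nd.hsn = max(nd.hsn, rq)
            # Reply at once if the sender has priority or we are not requesting.
            if (rq, t) < (nd.osn, s) or not nd.rcs:
                chp[s, t].append(1)
            else:
                nd.rd[t] = True
        elif kind == "recv_reply":
            chp[t, s].popleft()
            nd.c -= 1
        elif kind == "enter":
            nd.in_cs = True
            entries += 1
            assert sum(x.in_cs for x in nodes) == 1, "mutual exclusion violated"
        else:  # exit: release the section and answer all deferred requests
            nd.in_cs, nd.rcs = False, False
            for p in range(n):
                if nd.rd[p]:
                    nd.rd[p] = False
                    chp[s, p].append(1)
    return entries

print(simulate(n=4, steps=5000, seed=1))
```

A run of this kind samples a single schedule for a fixed number of nodes, whereas the proofs in this paper cover all schedules and an arbitrary N.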
The following variables are not needed by the algorithm; they were added to
facilitate the proof.

- z is the index of a generic node, which is used to specify and verify
  accessibility.

- y is the index of the node with the minimal value of the rank (osn[i], i),
  where the minimum is taken over all nodes i such that rcs[i] is true. If rcs[i]
  are all false, y = 1.

- mini(osn, rcs) is a function that computes y, the index of the node with the
  minimal rank.


3 Proof of the Mutual Exclusion Property 

Invariance properties of the form $\Box\varphi$, where $\varphi$ is an assertion
(a state formula), can be verified by the invariance rule B-INV, given by

Rule B-INV

I1. $\Theta \rightarrow \varphi$

I2. $\rho_\tau \wedge \varphi \rightarrow \varphi'$ for every transition $\tau \in \mathcal{T}$

$\Box\varphi$

where $\Theta$ is the initial condition and $\mathcal{T}$ is the set of transitions
of the verified system. An assertion satisfying premises I1 and I2 of rule B-INV is
called inductive.
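For a finite-state system, the two premises can be checked by plain enumeration. The following sketch assumes a toy system (a counter modulo 8 that steps by 2) which is unrelated to the RA algorithm; the name `is_inductive` is hypothetical. It also shows that an assertion may hold in every reachable state without being inductive, which is why invariants often have to be strengthened or proved jointly.

```python
def is_inductive(states, init, transitions, phi):
    """Check premises I1 and I2 of rule B-INV by enumerating a finite state set."""
    # I1: every initial state satisfies phi.
    i1 = all(phi(s) for s in states if init(s))
    # I2: every transition taken from a phi-state leads to a phi-state.
    i2 = all(phi(t)
             for s in states if phi(s)
             for step in transitions
             for t in step(s))
    return i1 and i2

# Toy system (an assumption for illustration): x starts at 0 and steps by 2 modulo 8.
states = range(8)
init = lambda x: x == 0
step = lambda x: [(x + 2) % 8]

print(is_inductive(states, init, [step], lambda x: x % 2 == 0))  # evenness is inductive
print(is_inductive(states, init, [step], lambda x: x != 3))      # invariant, yet not inductive
```

The second assertion fails premise I2 at the unreachable state x = 1, whose successor is 3; conjoining it with the first assertion yields an inductive assertion.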

In our case, the main invariance property is that of mutual exclusion, which
can be specified as

PROPERTY excl: $\Box\, \forall i, j : [1..N] : m_5[i] \wedge m_5[j] \rightarrow i = j$

Here and below, we use $m_5[i]$ to denote that process M[i] is currently executing
at location $m_5$.

3.1 Bottom Up Assertions 

At first we use a bottom-up approach to deduce some simple properties of the 
program. 

Locations at which rcs = 1. A first observed property is

PROPERTY rcs_range: $\Box(m_{31,32,4,5,6}[i] \rightarrow rcs[i])$.

Note that whenever there is a free index, such as $i$ in the above property, there
is an implicit universal quantification, implying that the property holds for every
$i \in [1..N]$.

Range of p[i]. The variable p serves as a loop counter for the loops at
statements $m_{31}$ and $m_{71}$. The upper limit of these two loops is N.

PROPERTY p_range: $\Box(1 \le p[i] \le N + 1 - m_{32,72}[i])$

This inductive assertion claims that $p[i] \le N + 1$ at all locations, except for
locations $m_{32}$ and $m_{72}$, where the stronger inequality $p[i] \le N$ holds.

The Message Chain Linkage

PROPERTY message_chain:

$\Box\big((m_{31,32}[i] \wedge p[i] > j) + m_4[i] \ge |chq[i,j]| + rd[j,i] + |chp[j,i]|\big)$

Here, $|chq[i,j]|$ and $|chp[j,i]|$ denote the sizes of the buffers of these
asynchronous channels. This property states that the sum
$|chq[i,j]| + rd[j,i] + |chp[j,i]|$ never exceeds 1, and can be positive only if
process M[i] is at location $m_4$ or at locations $m_{31,32}$ with $p[i] > j$.

The Reply Counter. The role of the counter c[i] is to count the number of
positive replies Node[i] received since it last sent out requests for entering the
critical section. We would expect that, at any point, the value of c[i] will equal
the number of pending replies. This is stated by

PROPERTY c_range:

$\Box\Big(c[i] = \Big(\sum_{k=1}^{N} |chq[i,k]| + rd[k,i] + |chp[k,i]|\Big) + m_{31,32}[i] \cdot (N - p[i] + (p[i] > i))\Big)$

Neither of properties message_chain or c_range is inductive by itself. However,
their conjunction, to which we refer as msg_range_counter, together
with the previously established property p_range, forms an inductive assertion.

The Value of a Request Message. As the last bottom-up invariant, we
formulate the following property:

PROPERTY request_in_channel:

$\Box(|chq[i,j]| > 0 \rightarrow head(chq[i,j]) = osn[i])$

This property states that if channel chq[i,j] is not empty, then the value it
contains is the current value of osn[i].



3.2 Top Down Assertions 

We now move to a set of assertions which are derived based on the goal we wish 
to prove, namely the property of mutual exclusion. 

We start by introducing some definitions:

$requested(i,j):\ i \ne j \wedge (m_{4,5}[i] \vee (m_{31,32}[i] \wedge p[i] > j))$

$request\_received(i,j):\ requested(i,j) \wedge |chq[i,j]| = 0$

$granted(i,j):\ request\_received(i,j) \wedge \neg rd[j,i]$

Variable hsn Retains the Highest Message Number Seen So Far

The following property states that, after having read the recent message from
Node[i], the variable hsn[j] ("highest seen") has a value which is not lower than
osn[i].

PROPERTY hsn_highest:

$\Box(request\_received(i,j) \rightarrow osn[i] \le hsn[j])$

The Implication of Node[j] Granting Permission to Node[i]. The
following property describes the implications of a situation in which Node[j] has
granted an entry permission to Node[i] before Node[i] exited its critical section:

PROPERTY permitted:

$\Box(granted(i,j) \rightarrow \neg rcs[j] \vee (osn[i], i) \prec (osn[j], j))$

Finally, Mutual Exclusion. Finally, we establish the property of mutual
exclusion, specified by

PROPERTY excl: $\Box(m_5[i] \wedge m_5[j] \rightarrow i = j)$



[Figure 2: dependence diagram over the properties excl, permitted, hsn_highest,
request_in_channel, msg_range_counter, p_range, and rcs_range, listed from top
to bottom.]

Fig. 2. Set of inductive properties leading to the proof of Mutual Exclusion. The 
labels on the dependence edges identify the transitions for which the verification 
of the higher placed property depends on the lower property. 
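4 A Proof Rule for Accessibility

Rule P-WELL

For assertions $p$ and $q = \varphi_0, \varphi_1[k], \ldots, \varphi_m[k]$,
transitions $\tau_1[k], \ldots, \tau_m[k] \in \mathcal{J}$,
a well-founded domain $(\mathcal{A}, \succ)$, and
ranking functions $\delta_0, \delta_1[k], \ldots, \delta_m[k] : \Sigma \mapsto \mathcal{A}$

W1. $p \rightarrow \bigvee_{i=0}^{m} \exists k : \varphi_i[k]$

For $i = 1..m$:

W2. $\rho_\tau \wedge \varphi_i[k] \rightarrow \bigvee_{j=0}^{m} \exists u : (\varphi'_j[u] \wedge \delta_i[k] \succ \delta'_j[u]) \vee (\varphi'_i[k] \wedge \delta_i[k] = \delta'_i[k])$ for every $\tau \in \mathcal{T}$

W3. $\rho_{\tau_i[k]} \wedge \varphi_i[k] \rightarrow \bigvee_{j=0}^{m} \exists u : (\varphi'_j[u] \wedge \delta_i[k] \succ \delta'_j[u])$

W4. $\varphi_i[k] \rightarrow En(\tau_i[k])$

$p \Rightarrow \Diamond q$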
To verify liveness properties of parameterized programs, we can use a fixed
number of intermediate formulas and helpful transitions but they may refer to
an additional parameter k which is a process index. A parameterized rule for
proving accessibility properties of parameterized systems has been presented in
[MP95b]. However, to verify a complicated system such as the RA algorithm, it
was necessary to introduce a new version of this rule, which we present here.

To improve readability of formulas, we write p^. as p^. [k] . Rule p-well uses 
parameterized intermediate assertions, parameterized helpful transitions, and
parameterized ranking functions. For each $i = 1, \ldots, m$, the parameter $k$ in
$\varphi_i[k]$, $\tau_i[k]$, and $\delta_i[k]$ ranges over some nonempty set, such as $[1..N]$.

The rule traces the progress of computations from an arbitrary p-state to
an unavoidable q-state. With each non-terminal assertion $\varphi_i$, $i > 0$, the
rule associates a helpful transition $\tau_i$, such that the system is just (weakly
fair) with respect to $\tau_i$. Premise W2 requires that the application of any
transition $\tau$ to a state satisfying a non-terminal assertion $\varphi_i$ will
never cause the rank of the state to increase. Premise W3 requires that if the
applied transition is helpful for $\varphi_i$ then the rank must decrease. Due to
the well-foundedness of the ranking functions, we cannot have an infinite chain of
helpful transitions, since this would cause the rank to decrease infinitely often.
Premise W4 stipulates that the helpful transition $\tau_i$ is always enabled on
every $\varphi_i$-state. Thus, we cannot have an infinite computation (which must
be fair) which avoids reaching a $q = \varphi_0$ state.



4.1 Representation by Diagrams 

The paper [MP94] introduced the graphical notation of verification diagrams.
For our application here, verification diagrams can be viewed as a concise and
optimized presentation of the components appearing in rule P-WELL. We refer the
reader to Figure 3 for explanation of some of the main elements which typically
appear in such diagrams.

The diagram contains a node for each assertion $\varphi_i$ that appears in the rule.
The helpful transition $\tau_i$ associated with $\varphi_i$ is identified by the
label of one or more directed edges departing from the node (labeled by)
$\varphi_i$. Thus, in the diagram of Figure 3, $m_2[z]$ is identified as the
transition helpful for assertion $\varphi_{13}$, and the helpful transition for
$\varphi_{10}$ is $q_1[z,i]$ even though it labels two edges departing from
$\varphi_{10}$.

The interconnection topology in the diagram provides a more specialized
(and efficient) version of the P-WELL rule. For a node $\varphi_i$, let $succ(i)$
be the set of indices of the nodes which are the targets of edges departing from
$\varphi_i$. Then the diagram suggests that, instead of proving premises W2 and W3
as they appear in the rule, it is sufficient to prove their following stronger
versions:

U2. $\rho_\tau \wedge \varphi_i[k] \rightarrow \varphi'_i[k] \wedge \delta_i[k] = \delta'_i[k]$ for every $\tau \in \mathcal{T}$

U3. $\rho_{\tau_i[k]} \wedge \varphi_i[k] \rightarrow \bigvee_{j \in succ(i)} \exists u : (\varphi'_j[u] \wedge \delta_i[k] \succ \delta'_j[u])$

It is not difficult to see that U2 and U3 imply W2 and W3. For example, 
premise U3 for assertion p^ as implied by the diagram is 

‘PeW Pm72 W ^ (/"tW a 5^[i]) V {p^[i] A A 

The more general notion of verification diagrams as presented in [MP94] admits 
two types of edges, one corresponding to the helpful transitions, which are the 
edges present in our diagram. The other type corresponds to unhelpful transi- 
tions. It is suggested there to use a double line for helpful edges. In our case, we 
only need to represent helpful transitions, so we draw them as single lines. 

The rule also requires associating with each non-terminal assertion $\varphi_i$ a
ranking function $\delta_i$. By convention, whenever a ranking function is not
explicitly defined within a node $\varphi_i$, the default value is the index of the
node, i.e. $\delta_i = i$. For example, in the diagram, $\delta_{13} = 13$. However,
as we will see below, this is not the end of the story.

4.2 Encapsulation Conventions 

The diagram of Figure 3 contains, in addition to basic nodes such as those
labeled by assertions, also compound nodes which are also called blocks. We may
refer to compound nodes by the set of basic nodes they contain. For example,
the successor of node $\varphi_{13}$ is the compound node $\varphi_{11,12}$.
Compound nodes may be annotated by λ-declarations, such as the compound node
$\varphi_{4..7}$, by additional assertions, such as $m_4[y]$ for block
$\varphi_{3..7}$, or ranking components, such as $(6, -p[i])$ for block
$\varphi_{6,7}$. There are several encapsulation conventions associated
with compound nodes.

- An edge stopping at the boundary of a block is equivalent to individual
  edges which reach the basic nodes contained in the block. Thus, both
  $\varphi_{11}$ and $\varphi_{12}$ are immediate successors of node $\varphi_{13}$.
- For each basic node $\varphi_i$, the full assertion associated with this node is
  the conjunction of all the assertions labeling the blocks in which the basic node
  is contained. We denote this full assertion by $\hat{\varphi}_i$. For example,

  $\hat{\varphi}_7 = m_4[z] \wedge (\forall j : |chq[z,j]| = 0) \wedge m_4[y] \wedge rd[i,y] \wedge m_{71}[i]$

- For each basic node $\varphi_i$, the full ranking function associated with this
  node is the left-to-right concatenation of all the ranking components labeling the
  blocks in which the basic node is contained. As the rightmost component,
  we add $i$. For example,

  $\hat{\delta}_7 = (1, -osn[y], -y, 4, c[y], 6, -p[i], 7)$

  In Figure 3 we present the full ranking functions for each of the nodes
  alongside the diagram.

Note that whenever we have to compare ranking functions which are lexicographic
tuples of different lengths, we add zeroes to the right of the shorter one.
For example, to see that $\delta_{13} \succ \delta_{12}$, we confirm that
$(13, 0, 0) \succ (11, -p[z], 12)$.
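The padding convention is easy to state operationally. Below is a minimal sketch, assuming ranking values are tuples of integers; the helper name `rank_greater` is hypothetical, and the concrete values substituted for p[z] are chosen only for illustration.

```python
def rank_greater(a, b):
    """True if tuple a exceeds tuple b lexicographically after zero-padding on the right."""
    width = max(len(a), len(b))
    padded_a = tuple(a) + (0,) * (width - len(a))
    padded_b = tuple(b) + (0,) * (width - len(b))
    # Python compares equal-length tuples lexicographically, component by component.
    return padded_a > padded_b

p_z = 3  # an arbitrary value of p[z], assumed for illustration
print(rank_greater((13,), (11, -p_z, 12)))                  # delta_13 versus delta_12
print(rank_greater((11, -p_z, 11), (11, -(p_z + 1), 12)))   # p[z] grows, so the rank drops
```

The second comparison shows why the component −p[z] appears in the rank: incrementing p[z] lowers the tuple even though the trailing node index rises from 11 to 12.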

Note also that several components of the ranking functions are negative.
When STeP is presented with any ranking function, one of the proof obligations
which are generated requires proving that all components are bounded from
below. This has been done for all the components present in the diagram.



5 Proof of Accessibility Property 



The property of accessibility may be written in the form

PROPERTY m2m5: $m_2[z] \Rightarrow \Diamond m_5[z]$

where $z \in [1..N]$.



5.1 Auxiliary Assertions Needed for the Proof 



A crucial part in the proof is the computation of the index of the process y with
minimal signature. We define

$ismin(osn, rcs, y):\ (y = 1 \wedge \forall j : \neg rcs[j]) \vee (rcs[y] \wedge \forall j : (rcs[j] \rightarrow (osn[y], y) \preceq (osn[j], j)))$

Thus, y has a minimal signature if either there is no process j with rcs[j],
and then y = 1, or there exists some j with rcs[j] = 1 and y is such a j with the
minimal signature. In fact, rather than defining the function mini explicitly, we
inform STeP of the following axiom:

AXIOM mini: $ismin(osn, rcs, mini(osn, rcs))$.
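Although the proof treats mini axiomatically, one explicit function that satisfies the axiom is easy to write down. The sketch below is an illustration under assumptions, not the definition used in STeP: arrays are Python lists, node indices run from 1 to N as in the paper, and signatures (osn[j], j) are compared lexicographically.

```python
def ismin(osn, rcs, y):
    """The predicate ismin(osn, rcs, y), with 1-based node index y."""
    n = len(osn)
    if not any(rcs):
        # First disjunct: nobody is requesting, so y must be 1.
        return y == 1
    # Second disjunct: y is requesting and its signature is minimal among requesters.
    return rcs[y - 1] and all((osn[y - 1], y) <= (osn[j - 1], j)
                              for j in range(1, n + 1) if rcs[j - 1])

def mini(osn, rcs):
    """One possible witness for AXIOM mini: the requester with the least signature."""
    requesting = [(osn[j - 1], j) for j in range(1, len(osn) + 1) if rcs[j - 1]]
    return min(requesting)[1] if requesting else 1

osn, rcs = [3, 2, 2], [True, True, True]
print(mini(osn, rcs), ismin(osn, rcs, mini(osn, rcs)))
```

Ties on osn are broken by the node index, which is what makes the signature order total.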
There were several auxiliary assertions whose invariance was necessary in order
to establish the proof obligations generated by STeP when presented with
the verification diagram of Figure 3. We list them below:

rd_osn : $rd[i,j] \rightarrow (osn[i], i) \prec (osn[j], j)$

not_rd_range : $m_{1,2}[i] \vee (m_{71,72}[i] \wedge p[i] > j) \rightarrow \neg rd[i,j]$

y_eq_mini : $y = mini(osn, rcs)$

y_is_min : $ismin(osn, rcs, y)$

rd_to_y : $rd[i,y] \rightarrow m_{71,72}[i] \wedge p[i] \le y$

y_not_change : $(\forall j : |chq[z,j]| = 0) \wedge m_4[z] \wedge m_2[s] \rightarrow (osn[y], y) \prec (hsn[s] + 1, s)$

The last property, y_not_change, is crucial in order to establish that y
can stop being minimal only by retiring on exit from $m_6[y]$. In particular, no
newcomer s can execute transition $m_2[s]$ and become minimal.

[Figure 3: verification diagram. Its basic nodes and full ranking functions are
listed below.]

Basic nodes:

$\varphi_{13}$: $m_2[z]$
$\varphi_{12}$: $m_{31}[z]$
$\varphi_{11}$: $m_{32}[z]$
$\varphi_{10}$: $|chq[z,i]| > 0$
$\varphi_9$: $m_{31}[y]$
$\varphi_8$: $m_{32}[y]$
$\varphi_7$: $m_{71}[i]$
$\varphi_6$: $m_{72}[i]$
$\varphi_5$: $|chq[y,i]| > 0$
$\varphi_4$: $|chp[i,y]| > 0$
$\varphi_3$: $c[y] = 0$
$\varphi_2$: $m_5[y]$
$\varphi_1$: $m_6[y]$
$\varphi_0$: $m_5[z]$

Full ranking functions:

$\delta_{13}$: $13$
$\delta_{12}$: $(11, -p[z], 12)$
$\delta_{11}$: $(11, -p[z], 11)$
$\delta_{10}$: $(10, \sum_{l=1}^{N} |chq[z,l]|)$
$\delta_9$: $(1, -osn[y], -y, 8, -p[y], 9)$
$\delta_8$: $(1, -osn[y], -y, 8, -p[y], 8)$
$\delta_7$: $(1, -osn[y], -y, 4, c[y], 6, -p[i], 7)$
$\delta_6$: $(1, -osn[y], -y, 4, c[y], 6, -p[i], 6)$
$\delta_5$: $(1, -osn[y], -y, 4, c[y], 5)$
$\delta_4$: $(1, -osn[y], -y, 4, c[y], 4)$
$\delta_3$: $(1, -osn[y], -y, 3)$
$\delta_2$: $(1, -osn[y], -y, 2)$
$\delta_1$: $(1, -osn[y], -y, 1)$
$\delta_0$: $0$

Here (at $m_4[y]$), $c[y] = \sum_{l=1}^{N} |chq[y,l]| + rd[l,y] + |chp[l,y]|$.

Fig. 3. Verification Diagram for the property $m_2[z] \Rightarrow \Diamond m_5[z]$.
5.2 Usage of STeP in the Proof

We used STeP version 1.4 from 2/XI/1999 in our proof. Some modifications
of the source program were necessary in order for STeP to accept our SPL
program. This version of STeP also fails to support lambda-blocks in the way
they were used in Figure 3. To overcome this difficulty, we had to feed STeP
with some processed fragments of this diagram and then manually modify some
of the resulting verification conditions. We hope that future versions of STeP
will provide direct support of lambda-blocks.

In spite of these minor inconveniences, we found STeP to be a very powerful
and useful verification system, specially geared to the temporal verification of
complex algorithms such as the Ricart-Agrawala algorithm we considered here.
Abstract. We consider a weak version of pseudorandom function generators
and show that their existence is equivalent to the non-learnability of
Boolean circuits in Valiant's pac-learning model with membership queries
on the uniform distribution. Furthermore, we show that this equivalence
still holds for the case of non-adaptive membership queries and for any
(non-trivial) p-samplable distribution.



1 Introduction 

In computational learning theory, many non-learnability results for (the rep- 
resentation independent version of) Valiant’s pac-learning model are based on 
cryptographic tools and assumptions. Already in [19], Valiant pointed out how 
the results of [8] can be used to show that (arbitrary) Boolean circuits are not 
efficiently pac-learnable if cryptographically secure one-way functions exist. Re- 
lated results can be found in, e.g., [11, 3, 1, 12, 13]. 

On the other hand, it is known [4] that the non-learnability (in polynomial
time) of Boolean circuits provides a sufficient condition under which RP and,
hence, P is different from NP. Recently, Impagliazzo and Wigderson [9] showed
that every problem in BPP can be solved deterministically in sub-exponential
time on almost every input (for infinitely many input lengths), provided that
EXP $\neq$ BPP. Since EXP = BPP implies NP = RP, this means that the
non-learnability of Boolean circuits can further serve as a hypothesis to achieve
derandomization results for probabilistic polynomial-time computations.

Interestingly, both implications, namely that Boolean circuits are not
learnable (if one-way functions exist) as well as the derandomization of BPP (if
Boolean circuits are not learnable), are based on the concept of pseudorandom
generation, i.e., the possibility to expand a small number of random bits (also
known as the seed) into a large amount of pseudorandom bits which cannot
be significantly distinguished from truly random bits by any polynomial-time
computation. An important difference between these two implications is,
however, that in the first case the pseudorandom generator has to run in polynomial
time, while for the derandomization of BPP it suffices that the pseudorandom
generator runs in exponential time.

A close connection between pseudorandom generation and non-learnability
has been demonstrated already in [8], where the existence of a pseudorandom
function generator (see Section 3) is shown to be in fact equivalent to the
non-learnability of Boolean circuits. This, however, only holds if we consider
learning agents that are successful with high probability for a randomly chosen
target (instead of being successful for all targets as in the standard model of
pac-learning). Furthermore, in the model of [8], the learning task has to be
accomplished with the help of membership queries and with respect to the uniform
distribution. Recall that the original model of [19] is distribution-free and
passive, i.e., the learning algorithm has to be successful for any (unknown)
distribution $\mathcal{D}$, and access to the target is given via random examples
chosen according to $\mathcal{D}$ together with their classification by the target.
Further connections between the average-case model of pac-learning and
cryptographic primitives have been found in [2].

Recently [14], also the worst-case model of (passive) pac-learning under the 
uniform distribution has been closely related to pseudorandom function gener- 
ation. In contrast to [8] the pseudorandomness condition in [14] is expressed 
in terms of the worst-case advantage of the distinguishing algorithms (see sec- 
tion 3). Moreover, the distinguishing algorithms for the function generator in [14] 
are required to be passive, i.e., the distinguishing algorithm may access its oracle 
only via random classified examples. This leads to an apparently weaker notion 
of pseudorandom function generation as compared to the standard definition [8] . 

Here we take a similar approach as in [14] for pac-learning with membership
queries under the uniform distribution [1, 12, 10]. As our main result we show
that the non-learnability of Boolean circuits is equivalent to the existence of
a weak pseudorandom function generator, where as in [14] the pseudorandomness
condition is expressed in terms of worst-case advantage of distinguishing
algorithms. Further we show that this kind of weak pseudorandom generation is
still strong enough to yield a derandomization result for RP. In contrast to the
above mentioned derandomization for BPP, the function generator that we use
is polynomial-time computable (instead of exponential time). Hence, we even get
that every problem in RP can be solved nondeterministically in polynomial time
on almost every input (for infinitely many input lengths), where for every given
$\epsilon > 0$, only $n^{\epsilon}$ many nondeterministic bits are used. Of course,
this implies that every problem in RP can also be solved deterministically in
sub-exponential time on almost every input.

As an application, we get that every learning algorithm for Boolean circuits 
which uses membership queries and which is successful for some arbitrary p- 
samplable distribution can be transformed into a learning algorithm which uses 
only non-adaptive membership queries and is successful for the uniform distribu- 
tion. Thus, if Boolean circuits are not learnable with non-adaptive membership 
queries under the uniform distribution, then Boolean circuits are not learnable 
with adaptive membership queries under any non-trivial [12] p-samplable distri- 
bution. 
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2 Preliminaries 

We use Bn to denote the set of all Boolean functions / : {0,1}” ^ {0,1}- 

A probability ensemble V = {!?„ : n > 0} is a sequence of probability distri- 
butions T>n on {0, 1}”. The ensemble T> is p-samplable if there is a polynomial 
time computable function / and a polynomial p such that for all n, f{X) is 
distributed according to when X is uniformly distributed over (0, 

For a polynomially bounded function $f : \mathbb{N} \to \mathbb{N}$, let $\mathcal{NP}[f]$ [7] denote the
class of all sets $A \in \mathcal{NP}$ for which there exists a set $B \in \mathcal{P}$ such that for all
strings $x$, $x \in A \Leftrightarrow \exists y \in \{0,1\}^{f(|x|)} : (x,y) \in B$. For a function $\epsilon : \mathbb{N} \to [0,1]$
and any class $C$ of sets, $\mathcal{H}eur_{\epsilon(n)} C$ denotes the class of all sets $A$ such that for
any p-samplable distribution ensemble $\mathcal{D}$ there is a set $A' \in C$ such that for all $n$
and for $X \in_{\mathcal{D}_n} \{0,1\}^n$, $\Pr(A(X) \neq A'(X)) < \epsilon(n)$. Here, $A(x)$ is used to denote
the characteristic function of a set $A$.

Let $X$ and $Y$ be independent and identically distributed random variables on
$\{0,1\}^n$. The Renyi entropy of $X$ is $ent_{Ren}(X) = -\log(\Pr(X = Y))$, and the
minimum entropy of $X$ is $ent_{min}(X) = \min\{-\log(\Pr(X = a)) : a \in \{0,1\}^n\}$. Note
that for any random variable $X$ defined on $\{0,1\}^n$, $ent_{Ren}(X)/2 \le ent_{min}(X) \le
ent_{Ren}(X)$. The Renyi entropy of a distribution $\mathcal{D}$ is the Renyi entropy of a
random variable $X$ chosen according to $\mathcal{D}$.
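
The two entropy notions and the inequality between them can be checked numerically. The following is a minimal sketch, assuming base-2 logarithms and a distribution given as a plain list of probabilities; the function names and the example distribution are illustrative and not taken from the paper.

```python
import math

def renyi_entropy(probs):
    """Renyi entropy: -log2 of the collision probability Pr(X = Y) for i.i.d. X, Y."""
    collision = sum(p * p for p in probs)
    return -math.log2(collision)

def min_entropy(probs):
    """Minimum entropy: the minimum of -log2 Pr(X = a) over all outcomes a."""
    return min(-math.log2(p) for p in probs if p > 0)

# Hypothetical example distribution on {0,1}^2 (four outcomes).
example = [0.5, 0.25, 0.125, 0.125]
h_ren = renyi_entropy(example)
h_min = min_entropy(example)
print(h_ren / 2 <= h_min <= h_ren)
```

For the uniform distribution both quantities coincide, which is the case where the upper bound is tight.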

2.1 Predictability 

We recall the learning model of efficient prediction with membership queries 
(cf. [1, 12]). A representation of concepts $C$ is any subset of $\{0,1\}^* \times \{0,1\}^*$. A
pair $(u,x)$ of $\{0,1\}^* \times \{0,1\}^*$ is interpreted as consisting of a concept name $u$
and an example $x$. The concept represented by $u$ is $\kappa_C(u) = \{x : (u,x) \in C\}$.

A prediction with membership queries algorithm, or pwm-algorithm, is a possibly
randomized algorithm $A$ that takes as input two positive integers $s$ and $n$,
where $s$ is the length of the target concept name and $n$ is the length of examples,
as well as a rational accuracy bound $\epsilon$. It may make three different kinds
of oracle calls, the responses to which are determined by the unknown target
concept $c = \kappa_C(u)$ with $|u| = s$, and an unknown distribution $\mathcal{D}_n$ on $\{0,1\}^n$ as
follows.

1. A membership query takes a string $x \in \{0,1\}^*$ as input and returns 1 if $x \in c$
and 0 otherwise.

2. A request for a random classified example takes no input and returns a pair
$(x,b)$, where $x$ is a string chosen according to $\mathcal{D}_n$ and $b = 1$ if $x \in c$ and
$b = 0$ otherwise.

3. A request for an element to predict takes no input and returns a string $x$
chosen according to $\mathcal{D}_n$.

The algorithm A may make any number of membership queries or requests for 
random classified examples. However, A must eventually make one and only one 
request for an element to predict and then eventually halt with an output of 1 
or 0 without making any further oracle calls. 
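
The three oracle calls and the restriction on their order can be sketched as a small interface. This is only an illustration under stated assumptions: the class name `PwmOracles`, the representation of the target concept as a Python set of strings, and the sampler callable standing in for the unknown distribution are all hypothetical.

```python
class PwmOracles:
    """Oracle access for a pwm-algorithm: a target concept c and a sampler for D_n."""

    def __init__(self, concept, sample):
        self.concept = concept    # set of strings, the target concept c (assumed encoding)
        self.sample = sample      # zero-argument callable drawing a string according to D_n
        self.challenge = None     # set once the element to predict has been requested

    def _check_open(self):
        # After the single request for an element to predict, no oracle call is allowed.
        if self.challenge is not None:
            raise RuntimeError("no oracle calls after the element to predict")

    def membership_query(self, x):
        self._check_open()
        return 1 if x in self.concept else 0

    def random_example(self):
        self._check_open()
        x = self.sample()
        return x, (1 if x in self.concept else 0)

    def element_to_predict(self):
        self._check_open()
        self.challenge = self.sample()
        return self.challenge
```

A prediction algorithm would call `membership_query` and `random_example` as often as it likes, then call `element_to_predict` exactly once and output its guess.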

A pwm-algorithm $A$ is said to $\epsilon$-predict $\kappa_C(u)$ on $\mathcal{D}_n$ if on input $s = |u|$, $n$
and $\epsilon$, when $A$ is run with respect to the target concept $\kappa_C(u)$ and distribution
$\mathcal{D}_n$, the probability is at most $\epsilon$ that the output of $A$ is not equal to the correct
classification of $X$ by $\kappa_C(u)$, where $X \in_{\mathcal{D}_n} \{0,1\}^n$ is the request for an element
to predict. Furthermore, a pwm-algorithm $A$ is said to run in polynomial time if
its running time is bounded by a polynomial in $s$, $n$, and $1/\epsilon$.

Definition 1. Let $C$ be a representation of concepts, and let $\mathcal{D} = \{\mathcal{D}_n : n > 0\}$
be a probability ensemble. Then $C$ is predictable on $\mathcal{D}$ with membership queries
in polynomial time if there exists a polynomial-time pwm-algorithm $A$ such that
for all positive integers $s$ and $n$, for all concept names $u$ of length $s$, and for all
positive rationals $\epsilon$, $A$ $\epsilon$-predicts $\kappa_C(u)$ on $\mathcal{D}_n$.

Recall that in the weak model of learning [11] the learning algorithm has to 
predict the target only slightly better than a completely random guess. 

Definition 2. Let $C$ be a representation of concepts, and let $\mathcal{D}$ be a probability
ensemble. Then $C$ is weakly predictable on $\mathcal{D}$ with membership queries in polynomial
time if there exists a constant $c > 0$ and a polynomial-time pwm-algorithm
A such that for all positive integers s and n, and for all concept names u of 
length s, A -predicts kc{u) on 

In the distribution-free model of learning, weak and strong learning are shown
to be equivalent in [18]. Based on Yao's XOR lemma [20], a similar result is shown
in [5] also for learning on the uniform distribution.

Theorem 1. [5] For every representation of concepts $C \in \mathcal{P}$ there exists a
representation of concepts $C' \in \mathcal{P}$ such that if $C'$ is weakly predictable on
the uniform distribution (with membership queries) in polynomial time, then
also $C$ is predictable on the uniform distribution (with membership queries) in
polynomial time.

We also consider predictability with non-adaptive membership queries, where 
by a non-adaptive membership query we mean a query that does not depend on 
the responses to previous queries. It may, however, still depend on the random 
coin tosses used by the prediction algorithm. Note that Theorem 1 holds also 
for non-adaptive membership queries. 

Recall that in the pac-learning model the learning algorithm is required
to output a concept name which approximates the target well rather than to
guess the correct classification of the target by itself. For Boolean circuits, however,
polynomial-time pac-learnability and predictability coincide. Furthermore,
if there exists some representation of concepts $C \in \mathcal{P}$ that is not predictable in
polynomial time then Boolean circuits are not predictable in polynomial time.

3 Weak Pseudorandom Generators 

In this section we introduce our weak version of a pseudorandom function generator.
But first let us recall the standard definition. Let $f : \{0,1\}^{l(n)} \times \{0,1\}^n \to \{0,1\}$
be a polynomial-time computable function ensemble. Note that the polynomial-time
computability of $f$ implies that $l$ is polynomially bounded. For a
fixed string $x$ of length $l(n)$ we can view $f(x, \cdot)$ as a function $f_x \in B_n$ that is
generated by $f$ on the seed $x$, and therefore we refer to $f$ also as a function generator.
Now $f$ is a pseudorandom function generator if the function $f_X$ produced by $f$
for a random seed $X \in_U \{0,1\}^{l(n)}$ cannot be significantly distinguished from a
truly random function $F \in_U B_n$ by any polynomial-time computation, i.e., for
all probabilistic polynomial-time oracle algorithms $T$, for all positive integers $c$,
and for infinitely many integers $n$, the success

$\delta(n) = | \Pr(T^{f_X}(0^n) = 1) - \Pr(T^F(0^n) = 1) |$

of $T$ for $f$ is less than $1/n^c$. The algorithm $T$ is also called a distinguishing algorithm
or test for $f$. Our definition of a weak pseudorandom function generator is based
on the advantage of a test $T$ for $f$ (cf. [14]) which is defined with respect to a
fixed seed $x \in \{0,1\}^{l(n)}$ as

$\epsilon(x) = \Pr(T^{f_x}(0^n) = 1) - \Pr(T^F(0^n) = 1)$.

Note that the success of $T$ for $f$ can be expressed by the average advantage of
$T$ for $f$ as $\delta(n) = |E(\epsilon(X))|$ for a random seed $X \in_U \{0,1\}^{l(n)}$. So we refer to
$\epsilon(n) = \min\{\epsilon(x) : x \in \{0,1\}^{l(n)}\}$ as the worst-case advantage of $T$ for $f$.
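
The difference between the (average-case) success and the worst-case advantage can be made concrete with a toy calculation. The sketch below assumes the per-seed advantages of a test are already known and stored in a list; the numbers are hypothetical and only illustrate that a test may have large success while its worst-case advantage is small or even negative.

```python
def success(advantages):
    """Success of a test: absolute value of the advantage averaged over a uniform seed."""
    return abs(sum(advantages) / len(advantages))

def worst_case_advantage(advantages):
    """Worst-case advantage of a test: minimum advantage over all seeds."""
    return min(advantages)

# Hypothetical per-seed advantages of some test on four seeds.
eps = [0.30, 0.20, 0.25, -0.05]
print(success(eps), worst_case_advantage(eps))
```

In this toy instance a single bad seed is enough to keep the worst-case advantage low, which is why the weak notion is easier to satisfy than the standard one.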

Definition 3. A weak pseudorandom function generator is a polynomial-time
computable function ensemble $f : \{0,1\}^{l(n)} \times \{0,1\}^n \to \{0,1\}$, such that for all
probabilistic polynomial-time oracle algorithms $T$, for all positive integers $c$, and
for infinitely many integers $n$, the worst-case advantage of $T$ for $f$ is less than
$1/n^c$.

As a first property let us mention that a weak pseudorandom function genera- 
tor is still useful for derandomizing probabilistic polynomial-time computations. 
We omit the proof. 

Proposition 1. If there exists a weak pseudorandom function generator, then
for any constant $c > 1$, $\mathcal{RP} \subseteq \text{io-}\mathcal{H}eur_{n^{-c}}\mathcal{NP}[n^{1/c}]$.

As opposed to a pseudorandom generator, which has to expand its seed only 
by at least one bit, we can consider a pseudorandom function generator as a 
function which expands a seed of polynomial length into a pseudorandom
bit-sequence of length $2^n$. In the standard setting it is well known that a
pseudorandom generator can be used to construct a pseudorandom function generator
[8]. It is thus an immediate question whether a similar fact can also be shown 
with respect to worst-case advantage. Unfortunately, we are not able to answer 
this question completely. We can show, however, that there exists a weak pseu- 
dorandom function generator if there exist weak pseudorandom generators with 
an arbitrary polynomial expansion. 

The proof is based on universal hashing, a ubiquitous tool in cryptography.
A linear hash function $h$ from $\{0,1\}^n$ to $\{0,1\}^m$ is given by a Boolean $n \times m$
matrix $(a_{i,j})$ (or, equivalently, by an $mn$-bit string $a_{1,1} \ldots a_{1,m} \ldots a_{n,1} \ldots a_{n,m}$)
and maps any string $x = x_1 \ldots x_n$ to a string $y = y_1 \ldots y_m$, where $y_i$ is the inner
product $a_i \cdot x = \sum_j a_{i,j} x_j \pmod 2$ of the $i$-th row and $x$. In [6] it is shown
that the set of all linear hash functions from $\{0,1\}^n$ to $\{0,1\}^m$ is universal, i.e.,
for all $n$-bit strings $x$ and $y$ with $x \neq y$, $\Pr(H(x) = H(y)) = 2^{-m}$ when $H$ is
chosen uniformly at random from the set of all linear hash functions from $\{0,1\}^n$
to $\{0,1\}^m$. Now let us say that a hash function $h$ from $\{0,1\}^n$ to $\{0,1\}^m$ causes
a collision on a set $Q \subseteq \{0,1\}^n$ if there exist two different strings $x$ and $y$ in
$Q$ such that $h(x) = h(y)$. Otherwise, $h$ is said to be collision-free on $Q$.
Then, by a union bound over all pairs in $Q$, a random linear hash function causes
a collision on $Q$ with probability at most $\binom{|Q|}{2} \cdot 2^{-m}$.
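
The universality property can be verified exhaustively for very small parameters. The following sketch enumerates all linear hash functions over GF(2) and counts how many of them map two fixed distinct strings to the same value. The matrix layout (one row of n bits per output bit) and the function names are assumptions of this illustration.

```python
import itertools

def linear_hash(matrix, x):
    """Apply a linear hash over GF(2): output bit i is the inner product <row_i, x> mod 2."""
    return tuple(sum(a & b for a, b in zip(row, x)) % 2 for row in matrix)

def collision_fraction(n, m, x, y):
    """Fraction of all linear hash functions {0,1}^n -> {0,1}^m with h(x) == h(y)."""
    total = hits = 0
    for bits in itertools.product((0, 1), repeat=n * m):
        # Cut the mn-bit string into m rows of n bits each.
        matrix = [bits[i * n:(i + 1) * n] for i in range(m)]
        total += 1
        hits += linear_hash(matrix, x) == linear_hash(matrix, y)
    return hits / total

# Two distinct 3-bit strings hashed to 2 bits: the collision fraction is 2^-2.
print(collision_fraction(3, 2, (1, 0, 1), (0, 1, 1)))
```

The enumeration is exponential in nm, so it is only meant as a sanity check of the definition, not as a practical procedure.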

Lemma 1. Suppose that there exists a family $\{g_k : k \ge 1\}$ of function ensembles
$g_k : \{0,1\}^n \to \{0,1\}^{n^k}$ with the following properties:

1. The $i$-th bit of $g_k(x)$ is computable in time polynomial in $|x|$ and $k$.

2. For all positive integers $k$ and $c$, for all probabilistic polynomial-time
algorithms $T$, and for infinitely many integers $n$, there exists some $x \in \{0,1\}^n$
such that for $Y \in_U \{0,1\}^{n^k}$,

$\Pr(T(g_k(x)) = 1) - \Pr(T(Y) = 1) < \frac{1}{n^c}$.

Then there exists a weak pseudorandom function generator.

Proof. Based on the family $\{g_k : k \ge 1\}$ we define a function generator $f$ as
follows. Given binary strings $x$ and $z$ of length $n$, and further two binary strings
$k$ and $h$ of length $\log n$ and $n^2 \log n$, respectively, we think of $k$ as encoding a
positive integer $k$ and of $h$ as a linear hash function mapping strings of
length $n$ to strings of length $m = \lfloor \log n^k \rfloor$ (by using only the first $nm$ bits of
the string $h$). Then we can interpret the $m$-bit string $h(z)$ as a positive integer
$h(z) \le n^k$, and define

$f(k \circ h \circ x, z) = g_k(x)[h(z)]$,

where $g_k(x)[h(z)]$ denotes the $h(z)$-th bit of the $n^k$-bit string $g_k(x)$.

By the computability condition on the family $\{g_k : k \ge 1\}$ it follows that $f$
is polynomial-time computable. To see that $f$ is a weak pseudorandom function
generator, assume to the contrary that there exist some $c > 1$ and a probabilistic
oracle test $T$ whose running time on input $0^n$ and with any oracle in $B_n$ is
bounded by $n^c$, and such that for all (sufficiently large) integers $n$, the worst-case
advantage of $T$ for $f$ is at least $1/n^c$. Note that this implies that for all
$k \in \{0,1\}^{\log n}$ and $x \in \{0,1\}^n$, and for random $H \in_U \{0,1\}^{n^2 \log n}$ and $F \in_U B_n$,

$\Pr(T^{f_{k \circ H \circ x}}(0^n) = 1) - \Pr(T^F(0^n) = 1) \ge \frac{1}{n^c}$.

Now fix $k = 4c$ and consider the test $T'$ for $g_k$ which on input $y \in \{0,1\}^{n^k}$
works as follows. First choose $h \in_U \{0,1\}^{n^2 \log n}$. Then simulate $T$ on input $0^n$,
where each query $z \in \{0,1\}^n$ is answered by the $h(z)$-th bit of $y$. Finally accept
if and only if $T$ accepts.

Under the condition that the hash function $h$ chosen by $T'$ is collision-free on
the set of queries $Q$ asked by $T$, the test $T'$ accepts a random $Y \in_U \{0,1\}^{n^k}$ with
the same probability as $T$ accepts with a random oracle $F \in_U B_n$. Even though
the queries in $Q$ might depend on $h$ for some specific $y \in \{0,1\}^{n^k}$, it is not hard
to see that for the random $Y$ and $H$, the probability that $H$ is collision-free on
$Q$ coincides with the probability that $H$ is collision-free on the set of queries
asked by $T$ when $T$ is run with the random oracle $F$. It follows that $H$ causes a
collision on $Q$ with probability at most

$\binom{|Q|}{2} \cdot 2^{-m} \le \frac{n^{2c}}{2^m} \le \frac{1}{2n^c}$,

where the last inequality follows by the choice of $k = 4c$ which implies that
$2^m > n^k/2 = n^{4c}/2 \ge 2n^{3c}$, and hence we have

$|\Pr(T'(Y) = 1) - \Pr(T^F(0^n) = 1)| \le \frac{1}{2n^c}$.

On the other hand, for all $x \in \{0,1\}^n$, the probability that $T'$ accepts $g_k(x)$
is just the probability that $T$ accepts with oracle $f_{k \circ H \circ x}$ for a random hash
function $H \in_U \{0,1\}^{n^2 \log n}$. It follows that for all $x \in \{0,1\}^n$,

$\Pr(T'(g_k(x)) = 1) - \Pr(T'(Y) = 1) \ge \frac{1}{2n^c}$.

But this contradicts the pseudorandomness condition of the lemma. □



4 Weak Pseudorandomness versus Predictability 

In this section we show that there exists some non-predictable representation of
concepts $C \in \mathcal{P}$ if and only if there exists a weak pseudorandom function generator.
The implication from non-predictability to the weak function generator
is based on the (well-known) construction of a pseudorandom generator due to
Nisan and Wigderson [17].

Definition 4. An $(r, l, n, k)$-design is a collection $S = (S_1, \ldots, S_r)$ of sets $S_i \subseteq
\{1, \ldots, l\}$, each of which has cardinality $n$, such that for all $i \neq j$, $|S_i \cap S_j| \le k$.
Given a function $f : \{0,1\}^n \to \{0,1\}$, the Nisan-Wigderson generator (based
on $f$ and $S$), $f^S : \{0,1\}^l \to \{0,1\}^r$, is for every seed $x = x_1 \ldots x_l$ of length $l$
defined as

$f^S(x) = f(x_{S_1}) \ldots f(x_{S_r})$,

where $x_{S_i}$, for $1 \le i \le r$, denotes the restriction of $x$ to $S_i = \{i_1 < \ldots < i_n\}$
defined as $x_{S_i} = x_{i_1} \ldots x_{i_n}$.
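
The definition can be turned into a few lines of code. The sketch below assumes a design given as a list of Python sets with 1-based positions and a Boolean function given as a callable on bit tuples; the toy design and the parity function are hypothetical choices made only to have something small to run.

```python
def restrict(x, subset):
    """Restriction of x to a set S: the bits at the (1-based) positions of S, in increasing order."""
    return tuple(x[i - 1] for i in sorted(subset))

def nw_generator(f, design, x):
    """Nisan-Wigderson generator: output bit i is f applied to x restricted to S_i."""
    return tuple(f(restrict(x, s)) for s in design)

def is_design(design, l, n, k):
    """Check the (r, l, n, k)-design conditions for a list of sets."""
    sets = [set(s) for s in design]
    if any(len(s) != n or not s <= set(range(1, l + 1)) for s in sets):
        return False
    return all(len(a & b) <= k for i, a in enumerate(sets) for b in sets[i + 1:])

# Hypothetical toy instance: a (3, 6, 3, 1)-design and parity as the function f.
toy_design = [{1, 2, 3}, {3, 4, 5}, {1, 5, 6}]
parity = lambda bits: sum(bits) % 2
print(nw_generator(parity, toy_design, (1, 0, 1, 1, 0, 0)))
```

Parity is of course not a hard function; it only serves to show how the seed bits are shared between the output positions.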

It is shown in [17] that for all positive integers $r$ and $n$ we can use polynomials
of degree at most $\log r$ over a suitably chosen field to construct an $(r, 4n^2, n, \log r)$-design
$S = (S_1, \ldots, S_r)$ such that each $S_i$ can be computed in time polynomial
in $n$ and $\log r$ [17]. In the following, we refer to this design as the low-degree
polynomial design.
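
A design of this kind can be sketched as follows, using the standard idea that two distinct polynomials of degree at most d over a field agree on at most d points. The concrete choices here (a prime field GF(q) with n <= q, the encoding of a point (a, b) as the integer a*q + b + 1, and the enumeration order of the polynomials) are assumptions of this illustration and need not match the exact parameters used in [17].

```python
import itertools

def polynomial_design(q, n, degree, r):
    """First r sets S_p = {(a, p(a)) : a < n} over GF(q), one per polynomial p of degree <= `degree`.

    Assumes q is prime and n <= q; a point (a, b) is encoded as the integer a * q + b + 1,
    so every set is a subset of {1, ..., q * q}.
    """
    design = []
    for coeffs in itertools.product(range(q), repeat=degree + 1):
        if len(design) == r:
            break
        # Evaluate the polynomial with coefficient vector `coeffs` at a, modulo q.
        values = [sum(c * pow(a, e, q) for e, c in enumerate(coeffs)) % q for a in range(n)]
        design.append({a * q + v + 1 for a, v in enumerate(values)})
    return design

sets = polynomial_design(q=5, n=4, degree=1, r=10)
print(len(sets), max(len(a & b) for i, a in enumerate(sets) for b in sets[i + 1:]))
```

Choosing q between n and 2n keeps the universe size q*q at most 4n^2, which is where a bound of that shape comes from in this kind of construction.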

Furthermore, as shown in [17], $f^S$ is a pseudorandom generator with respect
to non-uniformly computable distinguishing algorithms, provided that the function
$f$ is hard to approximate by polynomial-size circuits. This means that if
there is a polynomial-size circuit $T$ with sufficiently large distinguishing success
for $f^S$, then there is a polynomial-size circuit $T'$ that approximates $f$. It is
known that in fact $T'$ can be uniformly obtained from $T$, though at the expense
of polynomially many membership queries to $f$ (cf. [2, 9]). By inspecting the
proof in, e.g., [9] it is not hard to see that the required membership queries do
not depend on the answers to previously asked queries, i.e., $T'$ can be obtained
from $T$ by using only non-adaptive queries to $f$.

Lemma 2. [17, 2, 9] There exists a probabilistic oracle algorithm $A$ which given
as input an integer $n$, a circuit $T$ with input length $r$, a rational $\epsilon > 0$, and further
access to a function $f : \{0,1\}^n \to \{0,1\}$ computes with probability at least $1 - \epsilon$
a circuit $T'$ such that for $Z \in_U \{0,1\}^n$,

$\Pr(T'(Z) = f(Z)) \ge \frac{1}{2} + \frac{\delta}{r} - \epsilon$,

provided that for $X \in_U \{0,1\}^{4n^2}$, $Y \in_U \{0,1\}^r$ and the low-degree polynomial
$(r, 4n^2, n, \log r)$-design $S$,

$|\Pr(T(f^S(X)) = 1) - \Pr(T(Y) = 1)| \ge \delta$.

Moreover, $A$ runs in time polynomial in $n$, $|T|$ and $1/\epsilon$, and $A$ asks only
non-adaptive oracle queries to $f$.

Now, based on Lemmas 1 and 2 we can construct a weak pseudorandom
function generator under a non-predictability assumption.

Theorem 2. If there exists a representation of concepts $C \in \mathcal{P}$ that is not
weakly predictable with non-adaptive membership queries on the uniform distribution
in polynomial time, then there exists a polynomial-time computable weak
pseudorandom function generator.

Proof. Let $l(n) = 4n^2$. W.l.o.g. we assume that also the $l(n)$-size bounded
restriction of $C$ defined as $C_{l(n)} = \{(u,x) \in C : |u| = l(|x|)\}$ is not weakly
predictable with non-adaptive membership queries on the uniform distribution
in polynomial time. (Otherwise this can be achieved by a simple padding argument.)
Based on $C_{l(n)}$ we define a family $\{g_k : k \ge 1\}$ of function ensembles
$g_k : \{0,1\}^{2l(n)} \to \{0,1\}^{n^k}$ satisfying the conditions of Lemma 1. The theorem
will follow.

For all positive integers $n$ and $k$ with $k \le n$, and for all binary strings $u$ and
$x$ of length $l(n)$ define

$g_k(u \circ x) = u_n^S(x)$,

where $u_n$ denotes the characteristic function of the set $\kappa_C(u) \cap \{0,1\}^n$, and
$S = (S_1, \ldots, S_{n^k})$ is the low-degree polynomial $(n^k, 4n^2, n, \log n^k)$-design. For
$k > n$, put $g_k(u \circ x) = 0^{n^k}$.

Since $C \in \mathcal{P}$, and since each $S_i$ for $i = 1, \ldots, n^k$ can be computed in time
polynomial in $n$ and $\log n^k$, it follows that the family $\{g_k : k \ge 1\}$ satisfies the
computability condition of Lemma 1.

To see that the family $\{g_k : k \ge 1\}$ also satisfies the pseudorandomness
condition of Lemma 1, assume to the contrary that there exist positive integers
$k$ and $c$ and a polynomial-time computable test $T$, such that for all sufficiently
large $n$, for all binary strings $u$ and $x$ of length $l(n)$, and for $Y \in_U \{0,1\}^{n^k}$ it
holds that

$\Pr(T(g_k(u \circ x)) = 1) - \Pr(T(Y) = 1) \ge \frac{1}{n^c}$.

Note that this implies that the test $T$ distinguishes the random variable $g_k(U \circ X)$
with $U \in_U \{0,1\}^{l(n)}$ and $X \in_U \{0,1\}^{l(n)}$ from a uniformly distributed $Y \in_U
\{0,1\}^{n^k}$ by at least $1/n^c$.

Now consider the pwm-algorithm $A$ for $C_{l(n)}$ which on input $s = l(n)$ and $n$
and with respect to some concept $\kappa_C(u)$ with $|u| = l(n)$ works as follows. First
obtain a probabilistic circuit $T_{n^k}$ that computes $T$ on input length $n^k$ by using a
(finite) description of the Turing machine computing $T$. Then run the algorithm
of Lemma 2 with the circuit $T_{n^k}$, oracle $u_n$ and parameter $\epsilon = 1/(4n^{c+k})$ to obtain
a circuit $T'$. Finally request an element $z$ to predict, and answer with $T'(z)$.

By the assumption on the test $T$, the algorithm of Lemma 2 produces, for
all $n$ and concept names $u$ of length $l(n)$, and with probability at least $1 - \epsilon$, a
circuit $T'$ satisfying

$\Pr(T'(Z) = u_n(Z)) \ge \frac{1}{2} + \frac{1}{n^{c+k}} - \epsilon$

for $Z \in_U \{0,1\}^n$. This implies that $A$'s guess for the classification of $Z$ by $\kappa_C(u)$
is correct with probability at least

$\frac{1}{2} + \frac{1}{n^{c+k}} - 2\epsilon \ge \frac{1}{2} + \frac{1}{2n^{c+k}}$.

Further note that $A$ makes no additional membership queries except the
non-adaptive queries to $\kappa_C(u)$ required to produce the circuit $T'$. Thus, $C_{l(n)}$ is
weakly predictable with non-adaptive membership queries on the uniform distribution
in polynomial time. This, however, contradicts our assumption on $C$,
and hence, $g_k$ is a weak pseudorandom generator for all $k \ge 1$. The theorem now
follows from Lemma 1. □

We now proceed to show the converse of Theorem 2 for any p-samplable distribution,
provided that the distribution in question does not immediately imply
a trivial prediction algorithm. Kharitonov [12] showed that any representation
of concepts $C$ is polynomially weakly predictable on any distribution (ensemble)
$\mathcal{D} = \{\mathcal{D}_n\}$ with Renyi entropy $O(\log n)$. This motivates the following definition.

Definition 5. [12] A distribution ensemble $\mathcal{D} = \{\mathcal{D}_n\}$ is trivial if for all $n \ge 1$,
$\mathcal{D}_n$ has Renyi entropy $O(\log n)$.

A function generator $f : \{0,1\}^{l(n)} \times \{0,1\}^n \to \{0,1\}$ is associated in a natural
way with the representation of concepts $C_f = \{(x,z) : |x| = l(|z|), f(x,z) = 1\}$.
Note that for a string $x$ of length $l(n)$, $f_x$ is just the characteristic function of
the set $\kappa_{C_f}(x) \cap \{0,1\}^n$.

Theorem 3. If $f : \{0,1\}^{l(n)} \times \{0,1\}^n \to \{0,1\}$ is a weak pseudorandom function
generator, then $C_f$ is not weakly predictable with membership queries on any
non-trivial p-samplable distribution in polynomial time.

Proof. Assume to the contrary that there exists a non-trivial p-samplable distribution
$\mathcal{D}$ such that $C_f$ is weakly predictable with membership queries on $\mathcal{D}$ in
polynomial time, and let $A$ be a pwm-algorithm and $c$ be a positive integer such
that for all $n \ge 1$, the running time of $A$ on inputs $s = l(n)$ and $n$ is bounded
by $n^c$, and such that for all strings $x$ of length $l(n)$, $A$ $\epsilon$-predicts $\kappa_{C_f}(x)$ on
$\mathcal{D}_n$ with $\epsilon = \frac{1}{2} - \frac{1}{n^c}$.

We now use $A$ to construct a test $T$ for $f$. Given input $0^n$ and an oracle
$h : \{0,1\}^n \to \{0,1\}$, the test $T$ simulates $A$ with inputs $n$ and $s = l(n)$. When
$A$ makes a membership query $z$, then $T$ answers this query with $h(z)$. When $A$
requests a random classified example or an element to predict, then $T$ chooses a
string $z \in_{\mathcal{D}_n} \{0,1\}^n$ and returns either the example $(z, h(z))$ or the prediction
challenge $z$ to $A$. Finally, $T$ accepts if and only if the guess of $A$ on the prediction
challenge $z$ coincides with $h(z)$.

Obviously, for all positive integers n and for all strings x of length l{n), 
the test T accepts the oracle fx with probability at least | If on the 

other hand, the oracle for T is a random function F Gu Bn, then there are two 
possibilities. If the prediction challenge z coincides with some previously seen 
labeled sample, then we have to assume that A’s guess for z is correct. Since 
the running time of A, and hence the number of labeled samples it obtains (via 
example or membership queries) is bounded by n'^, and since I? is a non-trivial 
distribution, this can happen for a random prediction challenge Z Gx>„ {0, 1}” 
only with probability at most < n® • 

Otherwise, i.e. in the case where all previously seen examples are different from 
the prediction challenge z, the probability that A classifies z correctly is exactly 
I. It follows that T accepts the random oracle F with probability at most \ + ^- 
Thus, for all strings x of length l{n), 

Pr (r/"(0”) = 1) - Pr (T^(0") = l) > 

contradicting the assumption that / is a weak pseudorandom function generator. 

□ 

Since Boolean circuits are not pac-learnable if and only if there exists a
non-predictable representation of concepts $C \in \mathcal{P}$, we can combine Theorems 1, 2
and 3 to get the following corollary.

Corollary 1. The following are equivalent. 

1. There exists a weak pseudorandom function generator. 

2. Boolean circuits are not pac-learnable with non-adaptive membership queries
on the uniform distribution in polynomial time.

3. Boolean circuits are not pac-learnable with membership queries on any non- 
trivial p-samplable distribution in polynomial time. 

As an application of Corollary 1 to the theory of resource-bounded measure
[16], let us mention that it is shown in [15] that if $\mathcal{P}$/poly does not have
$p_1$-measure zero, and furthermore $\mathcal{EXP} \neq \mathcal{MA}$, then Boolean circuits are not
pac-learnable with non-adaptive queries on the uniform distribution in polynomial
time. By Corollary 1, this can be extended to pac-learnability with (adaptive)
membership queries on arbitrary non-trivial p-samplable distributions.

Furthermore, by Proposition 1, the non-learnability of Boolean circuits with
membership queries on any non-trivial p-samplable distribution implies that for
any constant $c > 1$, $\mathcal{RP} \subseteq \text{io-}\mathcal{H}eur_{n^{-c}}\mathcal{NP}[n^{1/c}]$.

Finally let us remark that the existence of a weak pseudorandom function 
generator can be expressed also in terms of (standard) success instead of worst- 
case advantage. But then we need to consider non-uniformly computable function 
generators and, moreover, the function generator might depend on the (still)
uniformly computable distinguishing algorithm. More formally, we say that a test $T$
breaks a function generator $f$ if for some positive integer $c$ and for all sufficiently
large $n$, the success of $T$ for $f$ is at least $1/n^c$. Now a slight modification of the
proof of Theorem 2 together with Theorem 3 yields the following equivalence. 

Corollary 2. The following are equivalent. 

1. There exists a weak pseudorandom function generator. 

2. There is some polynomial size-bound s{n) such that no probabilistic poly- 
nomial-time computable test can break every function generator computable 
by circuits of size s{n). 

Abstract. In this paper we present an approach for proving $\Theta_2^p$-completeness.
There are several papers in which different problems of
logic, of combinatorics, and of approximation are stated to be complete
for parallel access to NP, i.e. $\Theta_2^p$-complete.

There is a special acceptance concept for nondeterministic Turing machines
which allows a characterization of $\Theta_2^p$ as a polynomial-time
bounded class.

This characterization is the starting point of this paper. It makes a master
reduction from that type of Turing machines to suitable boolean
formula problems possible. From the reductions we deduce a couple of
conditions that are sufficient for proving $\Theta_2^p$-hardness. These new conditions
are applicable in a canonical way. Thus we are able to do the following:
(i) we can prove the $\Theta_2^p$-completeness for different combinatorial
problems (e.g. max-card-clique compare) as well as for optimization
problems (e.g. the Kemeny voting scheme), (ii) we can simplify known
proofs for $\Theta_2^p$-completeness (e.g. for the Dodgson voting scheme), and
(iii) we can transfer this technique for proving $\Delta_2^p$-completeness (e.g.
TSPcompare).



1 Introduction 

The complexity class $\Theta_2^p$ was established as a constitutional level
of the polynomial time hierarchy, e.g. by Wagner in [Wag90].
Further characterizations of this complexity class are given by several authors:
$\Theta_2^p = P^{NP[\log]} = L^{NP} = R^p_{tt}(NP) = P^{NP}_{\|}$.

Krentel [Kre88] has stated a characterization of the complexity class $\Delta_2^p = P^{NP}$
as a polynomial-time bounded class by the so called MAX-acceptance concept:
Given a nondeterministic polynomial-time bounded Turing machine $M$ with output
device and an input $x$, then $M$ accepts $x$ in the sense of MAX iff any
computation path with (quasilexicographically) maximum output accepts $x$.

* Supported in part by grant NSF-INT-9815095/DAAD-315-PPP-gfi-ab. 



S. Kapoor and S. Prasad (Eds.): FST TCS2000, LNCS 1974, pp. 348—360, 2000. 

This paper starts with a characterization of $\Theta_2^p$ by the following acceptance
concept, introduced in Spakowski/Vogel [SV99]:

Let $M$ be a nondeterministic polynomially bounded Turing machine with output
device and let $x$ be an input. $M$ accepts $x$ in the sense of MAX-CH iff $x$ is
accepted on any computation path $\beta$ of $M$ on $x$ with maximal number of mind-changes
in the output. For $w \in \{0,1\}^*$, ch($w$) denotes the number of mind-changes
in $w$. It holds ch(0) = ch(1) = ch(00) = ch(11) = 0, ch(10) = ch(01) = 1
and e.g. ch(10010) = 3, ch(10101) = 4.
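
The mind-change count is simply the number of adjacent positions at which the output string switches its symbol. A minimal sketch, assuming outputs are given as Python strings over the characters 0 and 1:

```python
def ch(w):
    """Number of mind-changes in w: adjacent positions where the symbol changes."""
    return sum(1 for a, b in zip(w, w[1:]) if a != b)

for word in ["0", "11", "10", "10010", "10101"]:
    print(word, ch(word))
```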

This concept means that the internal structure of the output is essential. It
allows a characterization of $\Theta_2^p$ as a polynomial-time bounded complexity class:
$\Theta_2^p$ = MAX-CH-P, which is in some sense analogous to $\Delta_2^p$ = MAX-P.

The theory of NP-completeness was initiated by Cook’s master reduction
[Cook71], continued by Karp’s basic NP-complete problems [Karp72] and established
by Garey/Johnson’s guide to the theory of NP-completeness [GJ79].
The aim of this paper is to give evidence that the theory of the class NP can
be transcribed to the class $\Theta_2^p$, and thus the theory of $\Theta_2^p$-completeness becomes
very canonical. We start our approach to that theory with the description of $\Theta_2^p$
as a polynomial-time bounded class. That allows the construction of “master
reductions” even to two different basic satisfiability problems of boolean formulas:

MAX-TRUE-3SAT-COMPARE:
Given two 3-CNF formulas $F_1$ and $F_2$; is the maximum number of 1’s in satisfying
truth assignments for $F_1$ less than or equal to that for $F_2$?

and

ODD-MAX-TRUE-3SAT:
Given a 3-CNF formula $F$; is the maximum number of 1’s in satisfying truth
assignments for $F$ odd?

Of course, Wagner [Wag87] has provided a useful tool for proving $\Theta_2^p$-hardness,
and we state his result below as lemma 7. However, the “master reductions”
stated in sect. 2.1 provide two conditions for proving $\Theta_2^p$-hardness (lemma 3 and
lemma 5) such that (first) Wagner’s condition is a consequence of our condition
and (second and more important) our condition is relatively simple to apply
because we can make use of the classical constructions (see section 3).

Section 3.1 summarizes the results for some basic combinatorial problems like
min-card-vertex cover compare and max-card-clique compare (given two
graphs we compare the sizes of the smallest vertex covers and largest cliques,
respectively). The completeness of these problems in $\Theta_2^p$ is a further argument
for establishing this class.

In section 3.2 we re-translate our method for the class $\Delta_2^p$ and we are able to
prove that TSPcompare (given two instances of traveling-salesperson we ask if
the optimal tour-length for the first TSP instance is shorter than that for the
second instance) is complete for $\Delta_2^p$, supplementing the list of $\Delta_2^p$-complete
problems given by Krentel.

Of special interest are the applications mentioned in section 3.3. Voting schemes
are very well studied in the social choice literature. Bartholdi/Tovey/Trick
[BTT89] investigated the computational complexity of such problems. They
proved that the Dodgson voting scheme as well as the Kemeny voting scheme
both are NP-hard. Hemaspaandra/Hemaspaandra/Rothe [HHR97] proved that
Dodgson voting is $\Theta_2^p$-complete using Wagner’s lemma. The exact analysis of the
Kemeny voting system was still an open problem.

We are able, following our method, to give a simplification of the involved proof
of [HHR97] as well as to prove that Kemeny voting is $\Theta_2^p$-complete. The result for
Kemeny voting is also stated in a survey paper of Hemaspaandra/Hemaspaandra
presented at MFCS 2000 very recently [HH00].

For the definitions and basic concepts we refer to the textbook [Pap94]. 

2 The Machine Based Technique 

2.1 Basic Problems Being Complete for $\Theta_2^p$

We gave in [SV99] a characterization of the complexity class $\Theta_2^p$ by the so called
MAX-CH acceptance concept:

Given a nondeterministic polynomial-time bounded Turing machine $M$ with
output device and an input $x$, then $M$ accepts $x$ in the sense of MAX-CH iff on
any computation path $\beta$ of $M$ on $x$ with maximum number of mind-changes of
the output $x$ is accepted. It turns out that $\Theta_2^p$ = MAX-CH-P, where MAX-CH-P
is the class of all sets decidable in the sense of MAX-CH by polynomial-time
bounded machines.

A slight modification of these machines yields the following lemma: 

Lemma 1. For every $A \in \Theta_2^p$ there are an NPTM $M$ with output device and
polynomials $p$ and $q$ such that the following is true:

1. For any input $x$ the output on every path $\beta$ is of equal length $q(|x|)$.

2. For any input $x$ every computation path $\beta$ is of equal length $p(|x|)$.

3. For any input $x$, two paths $\beta_1$ and $\beta_2$ have the same acceptance behaviour
whenever they have the same number of 1’s in the output.

4. $x \in A$ if and only if $M$ accepts $x$ on $\beta_{max}$, where $\beta_{max}$ is a computation
path having the maximum number of 1’s in the output.

We call this acceptance concept MAX-1-acceptance, in contrast to Krentel
[Kre88], who defined the so called MAX-acceptance.

We define the following three satisfiability problems for boolean formulas:

Decision Problem: MAX-TRUE-3SAT-COMPARE
Instance: Two 3-CNF formulas $F_1$ and $F_2$ having the same number of clauses
and variables
Question: Is the maximum number of 1’s in satisfying truth assignments for
$F_1$ less than or equal to that for $F_2$?

Decision Problem: MAX-TRUE-3SAT-EQUALITY
Instance: Two 3-CNF formulas $F_1$ and $F_2$ having the same number of clauses
and variables
Question: Is the maximum number of 1’s in satisfying truth assignments for
$F_1$ equal to that for $F_2$?

Decision Problem: ODD-MAX-TRUE-3SAT
Instance: A 3-CNF formula $F$
Question: Is the maximum number of 1’s in satisfying truth assignments for $F$ odd?
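
For very small instances the three problems can be decided by brute force, which may help to make the definitions concrete. The sketch below is exponential in the number of variables and is not the reduction studied in this paper. The clause encoding (tuples of nonzero integers, a negative integer meaning a negated variable) and the convention that an unsatisfiable formula has maximum 0 are assumptions of this illustration.

```python
import itertools

def max_true(clauses, num_vars):
    """Maximum number of 1's over all satisfying assignments (0 if unsatisfiable, by assumption).

    A clause is a tuple of nonzero ints: literal v stands for variable v, -v for its negation.
    """
    best = 0
    for bits in itertools.product((0, 1), repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause) for clause in clauses):
            best = max(best, sum(bits))
    return best

def max_true_3sat_compare(f1, f2, num_vars):
    return max_true(f1, num_vars) <= max_true(f2, num_vars)

def max_true_3sat_equality(f1, f2, num_vars):
    return max_true(f1, num_vars) == max_true(f2, num_vars)

def odd_max_true_3sat(f, num_vars):
    return max_true(f, num_vars) % 2 == 1

# Hypothetical toy formulas over three variables.
f1 = [(1, 2, 3), (-1, -2, -3)]   # at least one variable true, not all three true
f2 = [(1, 2, 3)]                 # at least one variable true
print(max_true(f1, 3), max_true(f2, 3))
```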

In all cases it is straightforward to prove that the problem is in $\Theta_2^p$ using binary
search. We concentrate on proving the hardness.

Theorem 2. MAX-TRUE-3SAT-COMPARE and MAX-TRUE-3SAT-EQUALITY are
complete in $\Theta_2^p$ under polynomial time many-one reduction.

To show the hardness we need the following lemma. 

Lemma 3. For every $A \in \Theta_2^p$ there are $B_1, B_2 \in P$ having the following properties:

1. ($x \in A \to m_1(x) = m_2(x)$) and ($x \notin A \to m_1(x) > m_2(x)$), where
$m_1(x) =_{df} \max\{[w]_1 : (x,w) \in B_1\}$¹ and
$m_2(x) =_{df} \max\{[w]_1 : (x,w) \in B_2\}$.

2. $x \in A \to M_1(x) = M_2(x)$ where
$M_1(x) =_{df} \{w : (x,w) \in B_1 \wedge [w]_1 = m_1(x)\}$ and
$M_2(x) =_{df} \{w : (x,w) \in B_2 \wedge [w]_1 = m_2(x)\}$

3. There is a polynomial $p$ such that for all $x$ and $w$, $(x,w) \in B_i \to |w| = p(|x|)$ ($i \in \{1,2\}$)

Proof. We start from the characterization of $\Theta_2^p$-sets given in lemma 1. Let
$A \in \Theta_2^p$, $M$ an NPTM deciding $A$ in the sense of MAX-1, $q$ and $p'$ the polynomials
determining the length of the output and the length of the computation paths
of $M$, respectively. For each $n$ let $p(n) =_{df} p'(n) + 1$.

We define $B_1$ and $B_2$:

$B_1 =_{df} \{(x,w) : x, w \in \Sigma^*$ and

1. $w$ has the form $out_M(x,\beta)^{p(|x|)} \beta$ and

2. The NPTM $M$ has on input $x$ on path $\beta$ the output $out_M(x,\beta)$ $\}$

$B_2 =_{df} \{(x,w) : x, w \in \Sigma^*$ and

1. $w$ has the form $out_M(x,\beta)^{p(|x|)} \beta$ and

2. The NPTM $M$ has on input $x$ on path $\beta$ the output $out_M(x,\beta)$ and

3. $M$ accepts $x$ on $\beta$ $\}$

For any $x$ let the set $B_{max}(x)$ contain all computation paths $\beta$ of $M$ on input
$x$ with the maximum number of 1’s in the output, and let $B'_{max}(x)$ contain all
paths from $B_{max}(x)$ having themselves the maximum number of 1’s.

$B_{max-acc}(x)$ and $B'_{max-acc}(x)$ differ from $B_{max}(x)$ and $B'_{max}(x)$ in that only accepting
computation paths are considered.



¹ $[w]_1$ denotes the number of 1's in $w$.

It's not hard to see that

$M_1(x) = \{ out_M(x,\beta')^{p(|x|)}\beta' : \beta' \in B'_{max}(x) \}$ , and (1)

$M_2(x) = \{ out_M(x,\beta')^{p(|x|)}\beta' : \beta' \in B'_{max-acc}(x) \}$ . (2)

Now we can sketch the proofs of statements 1 and 2 of the lemma.

Let $x \in A$. Then $M$ accepts $x$ on all paths $\beta$ from $B_{max}(x)$. Hence $B_{max-acc}(x) =
B_{max}(x)$ and $B'_{max-acc}(x) = B'_{max}(x)$. Due to (1) and (2), $M_1(x) = M_2(x)$
and $m_1(x) = m_2(x)$ follow.

Let $x \notin A$. Then $M$ rejects $x$ on all paths $\beta$ from $B_{max}(x)$. Let $\beta'_{max}(x) \in
B'_{max}(x)$ and $\beta'_{max-acc}(x) \in B'_{max-acc}(x)$.

Then

$[out_M(x,\beta'_{max}(x))]_1 > [out_M(x,\beta'_{max-acc}(x))]_1$ .

Therefore

$m_1(x) \overset{eq.(1)}{=} [out_M(x,\beta'_{max}(x))^{p(|x|)}\beta'_{max}(x)]_1
> [out_M(x,\beta'_{max-acc}(x))^{p(|x|)}\beta'_{max-acc}(x)]_1 \overset{eq.(2)}{=} m_2(x)$ .



Proof of theorem 2. Let $A \in \Theta_2^p$, and let $B_1$, $B_2$ be the P-sets and $p$ a polynomial
belonging to $A$ having the properties of lemma 3. For a given $x \in \Sigma^*$ we devise
two 3CNF-formulas $F_1$ and $F_2$ such that

$M_1(x) = M_2(x) \rightarrow \max\{[y]_1 : F_1(y)\} = \max\{[y]_1 : F_2(y)\}$ , and
$m_1(x) > m_2(x) \rightarrow \max\{[y]_1 : F_1(y)\} > \max\{[y]_1 : F_2(y)\}$ .



Our first step is to construct a boolean circuit $C$ with two output gates and
of polynomial size having the property

$C^t(w) = 1 \leftrightarrow (x,w) \in B_t$ $(t \in \{1,2\})$.

[Figure: the circuit $C$ with input gates $y_1, y_2, \ldots, y_{p(|x|)}$.]

² $y$ denotes a sequence $(y_1, y_2, \ldots)$.

Using standard techniques (see e.g. [Pap94]) we build 3CNF-formulas
$\mathcal{F}_1(y_1,\ldots,y_{p(|x|)},h_1,\ldots,h_m)$ and $\mathcal{F}_2(y_1,\ldots,y_{p(|x|)},h_1,\ldots,h_m)$ such that

$C^t(y_1,\ldots,y_{p(|x|)}) = 1 \leftrightarrow \bigvee_{h_1,\ldots,h_m} \mathcal{F}_t(y_1,\ldots,y_{p(|x|)},h_1,\ldots,h_m)$
$\leftrightarrow \bigvee^{!}_{h_1,\ldots,h_m} \mathcal{F}_t(y_1,\ldots,y_{p(|x|)},h_1,\ldots,h_m)$ $(t \in \{1,2\})$

and

$C^1(y_1,\ldots,y_{p(|x|)}) = 1 \wedge C^2(y_1,\ldots,y_{p(|x|)}) = 1
\rightarrow \bigvee^{!}_{h_1,\ldots,h_m} (\mathcal{F}_1(y_1,\ldots,y_{p(|x|)},h_1,\ldots,h_m) \wedge \mathcal{F}_2(y_1,\ldots,y_{p(|x|)},h_1,\ldots,h_m))$ . (3)

We define $F_t$ to be equivalent to $\mathcal{F}_t$, but with each variable $y_i$ replicated $3m$ times
(by adding clauses of the form $y_{i,1} \leftrightarrow y_{i,j}$ for each $j = 2$ to $3m$). This expansion
is needed to pad out the number of 1's to maintain the required inequality.
Note that

$\max\{[y]_1 : F_t(y)\} = 3m \max\{[y]_1 : C^t(y) = 1\} + \delta_t$ $(t \in \{1,2\})$ (4)

for a $\delta_t \in \{0,\ldots,m\}$. Hence

$m_1(x) > m_2(x) \rightarrow \max\{[y]_1 : F_1(y)\} > \max\{[y]_1 : F_2(y)\}$ ,

and due to (3)

$M_1(x) = M_2(x) \rightarrow \max\{[y]_1 : F_1(y)\} = \max\{[y]_1 : F_2(y)\}$ . (5)

□



Theorem 4. ODD-MAX-TRUE-3SAT is complete in $\Theta_2^p$ under polynomial-time
many-one reduction.



For proving the hardness we need the following lemma. 

Lemma 5. For every $A \in \Theta_2^p$ there is $B \in P$ having the following properties:

1. $x \in A \leftrightarrow \max\{[w]_1 : (x,w) \in B\} \equiv 1 \pmod 2$

2. There is a polynomial $p$ such that $\bigwedge_{x,w \in \Sigma^*} ((x,w) \in B \rightarrow |w| = p(|x|))$ .

Proof. In place of $B_1$ and $B_2$ defined in the proof of lemma 3 we define a single
set $B$ here:

$B =_{df} \{(x,w) : x,w \in \Sigma^*$ and

1. $w$ has the form $out_M(x,\beta)^{2p(|x|)}\beta^2 acc_M(x,\beta)$ and

2. The NPTM $M$ has on input $x$ on path $\beta$ the output $out_M(x,\beta)$ $\}$ ,

where $acc_M(x,\beta) =_{df} 0$ if $M$ rejects $x$ on $\beta$, and $acc_M(x,\beta) =_{df} 1$ if $M$ accepts $x$ on $\beta$.

It's easy to see that

$\max\{[y]_1 : (x,y) \in B\} = [out_M(x,\beta'_{max}(x))^{2p(|x|)}\beta'_{max}(x)^2 acc_M(x,\beta'_{max}(x))]_1$ . (6)

We conclude:

$x \in A \leftrightarrow acc_M(x,\beta'_{max}(x)) = 1 \leftrightarrow \max\{[y]_1 : (x,y) \in B\} \equiv 1 \pmod 2$ . □
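The last equivalence rests on a short parity computation, spelled out here as a worked step. It assumes that the witness $w$ for a path $\beta$ consists of the output repeated $2p(|x|)$ times, followed by two copies of $\beta$ and the single acceptance bit; this is the reading of the garbled definition adopted here, not a verified transcription.

```latex
\begin{align*}
[\,out_M(x,\beta)^{2p(|x|)}\,\beta^{2}\,acc_M(x,\beta)\,]_1
  &= 2p(|x|)\cdot[out_M(x,\beta)]_1 + 2\cdot[\beta]_1 + acc_M(x,\beta) \\
  &\equiv acc_M(x,\beta) \pmod 2 .
\end{align*}
```

Every term except the acceptance bit is even, so the parity of the maximum number of 1's is exactly the acceptance bit of a maximizing path.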

The proof of theorem 4 follows the same ideas as the proof of theorem 2, but it
makes use of lemma 5 instead of lemma 3.

2.2 Sufficient Conditions for $\Theta_2^p$-Hardness

Wagner [Wag87] stated a sufficient condition for a set to be hard for $\Theta_2^p$. It was
applied subsequently in a number of papers to prove the $\Theta_2^p$-hardness of various
problems. Lemma 5 from section 2.1 immediately implies another sufficient condition
for $\Theta_2^p$-hardness, which is given below in lemma 6. We show that Wagner's
statement follows easily from ours. Thus we have evidence that lemma 6 is at
least as strong as Wagner's condition.

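Lemma 6. A set $A \subseteq \Sigma^*$ is $\Theta_2^p$-hard if the following property holds:

(**) $\bigwedge_{B \in P} \bigwedge_{p \in Pol} \bigvee_{g \in FP} \bigwedge_{x \in \Sigma^*} (\max\{[y]_1 : |y| = p(|x|) \wedge (x,y) \in B\} \equiv 1 \pmod 2 \leftrightarrow g(x) \in A)$

□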

Lemma 7. [Wagner 1987]³ A set $A \subseteq \Sigma^*$ is $\Theta_2^p$-hard if the following property
holds:

(*) There exists a polynomial-time computable function $f$ and an NP-complete
set $D$ such that

$\|\{i : x_i \in D\}\| \equiv 1 \pmod 2 \leftrightarrow f(x_1,\ldots,x_{2k}) \in A$

for all $k \ge 1$ and all strings $x_1,\ldots,x_{2k} \in \Sigma^*$ satisfying

$\chi_D(x_1) \ge \chi_D(x_2) \ge \cdots \ge \chi_D(x_{2k})$.⁴ □



Let $A$ be an arbitrary set satisfying condition (*). We show that $A$ satisfies (**)
as well.

Let $B \in P$ and $p \in Pol$.

We define

$E =_{df} \{(x,y) : \bigvee_{y',\,|y'| = p(|x|)} ([y']_1 > y \wedge (x,y') \in B)\}$ .

For a given $x \in \Sigma^*$ we have to distinguish two cases.

We prove here case 1: $p(|x|)$ is odd.

We set

$x_1 = (x,0), x_2 = (x,1), x_3 = (x,2), \ldots, x_{p(|x|)+1} = (x,p(|x|))$ .

Thus

$\chi_E(x_1) \ge \chi_E(x_2) \ge \cdots \ge \chi_E(x_{p(|x|)+1})$

and $E \in NP$.

Let $h$ be a polynomial-time reduction from $E$ to the NP-complete set $D$. Then

$\|\{i : x_i \in E\}\| \equiv 1 \pmod 2 \leftrightarrow \|\{i : h(x_i) \in D\}\| \equiv 1 \pmod 2$
$\leftrightarrow f(h(x_1),\ldots,h(x_{p(|x|)+1})) \in A$
$\leftrightarrow g'(x_1,\ldots,x_{p(|x|)+1}) \in A$

for $g'$ being the composition of $h$ and $f$. Hence there is a $g \in FP$ such that

$\|\{i : x_i \in E\}\| \equiv 1 \pmod 2 \leftrightarrow g(x) \in A$ .

To complete the proof note that

$\|\{i : x_i \in E\}\| = \max\{[y]_1 : |y| = p(|x|) \wedge (x,y) \in B\}$ . □

³ Wagner states hardness for $P^{NP}_{bf}$, a class which is now known to be equal to $\Theta_2^p$.
⁴ $\chi_D$ denotes the characteristic function of $D$.
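The counting step can be checked on small data. The sketch below is illustrative only: `threshold_indicator` is a hypothetical helper, and the list `ones_counts` stands for the values $[y']_1$ of all $y'$ with $|y'| = p(|x|)$ and $(x,y') \in B$. It evaluates the characteristic function of $E$ on the thresholds $0, \ldots, p(|x|)$.

```python
def threshold_indicator(ones_counts, p):
    """Return [chi_E(x_1), ..., chi_E(x_{p+1})] where x_{j+1} = (x, j).

    (x, j) belongs to E iff some witness has strictly more than j ones.
    An empty witness list gives the all-zero vector.
    """
    best = max(ones_counts, default=-1)   # -1: no witness exceeds any threshold
    return [1 if best > j else 0 for j in range(p + 1)]
```

The vector is non-increasing, as required by condition (*), and the number of thresholds in $E$ equals the maximum itself, which is why the oddness of the count transfers to the oddness of the maximum.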

If we apply the same reasoning steps to lemma 3 instead of lemma 5, we will get 
the following two lemmas. 

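Lemma 8. A set $A \subseteq \Sigma^*$ is $\Theta_2^p$-hard if the following property holds:

$\bigwedge_{B_1,B_2 \in P} \bigwedge_{p \in Pol} \bigvee_{g \in FP} \bigwedge_{x \in \Sigma^*} (\max\{[y]_1 : |y| = p(|x|) \wedge (x,y) \in B_1\}
\le \max\{[y]_1 : |y| = p(|x|) \wedge (x,y) \in B_2\} \leftrightarrow g(x) \in A)$ □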



Lemma 9. A set $A \subseteq \Sigma^*$ is $\Theta_2^p$-hard if the following property holds: There exist
a polynomial-time computable function $f$ and two NP-complete sets $D_1$ and $D_2$
such that

$\|\{i : x_i \in D_1\}\| \le \|\{i : x_i \in D_2\}\| \leftrightarrow f(x_1,\ldots,x_{2k}) \in A$

for all $k \ge 1$ and all strings $x_1,\ldots,x_{2k} \in \Sigma^*$ satisfying
$\chi_{D_1}(x_1) \ge \chi_{D_1}(x_2) \ge \cdots \ge \chi_{D_1}(x_{2k})$ and
$\chi_{D_2}(x_1) \ge \chi_{D_2}(x_2) \ge \cdots \ge \chi_{D_2}(x_{2k})$ . □

This lemma is similar in style to Wagner's tool: we have a comparison in place
of oddness.

2.3 Remark: We Can Use 2SAT Instead of 3SAT 

What about the computational complexity of the “2SAT versions” of our comparison
and oddness problems?

It is well known that the satisfiability problem for 2CNF formulas is solvable
in polynomial time. Using binary search it is possible to find the lexicographically
largest satisfying assignment of a 2CNF formula in polynomial time. Hence
MAX-LEX-2SAT-COMPARE and MAX-LEX-2SAT-EQUALITY⁵ are in P.
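One way to make this concrete is sketched below. Note that it uses a bit-by-bit prefix search (fix $x_1$, then $x_2$, and so on) rather than a literal binary search; both need only polynomially many 2SAT tests. The function names and the clause encoding (pairs of nonzero integers, $v$ for $x_v$ and $-v$ for $\neg x_v$) are assumptions made for the example, and the satisfiability test uses plain reachability in the implication graph instead of a linear-time strongly-connected-components algorithm.

```python
def two_sat_satisfiable(n, clauses):
    """A 2CNF formula is unsatisfiable iff, for some variable v, the literals
    v and -v reach each other in the implication graph."""
    graph = {lit: set() for v in range(1, n + 1) for lit in (v, -v)}
    for a, b in clauses:              # (a or b) yields -a -> b and -b -> a
        graph[-a].add(b)
        graph[-b].add(a)

    def reaches(src, dst):
        seen, stack = {src}, [src]
        while stack:
            u = stack.pop()
            if u == dst:
                return True
            for w in graph[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return False

    return not any(reaches(v, -v) and reaches(-v, v) for v in range(1, n + 1))

def lex_max_assignment(n, clauses):
    """Lexicographically largest satisfying assignment as a 0/1 list,
    or None if the formula is unsatisfiable."""
    if not two_sat_satisfiable(n, clauses):
        return None
    fixed, result = list(clauses), []
    for v in range(1, n + 1):
        if two_sat_satisfiable(n, fixed + [(v, v)]):   # try x_v = 1 first
            fixed.append((v, v))
            result.append(1)
        else:                                          # forced to x_v = 0
            fixed.append((-v, -v))
            result.append(0)
    return result
```

Comparing two such assignments as bit strings then answers the COMPARE and EQUALITY questions for the lexicographic variants.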

In contrast to this we state here without proof: 

Theorem 10. MAX-TRUE-2SAT-COMPARE, MAX-TRUE-2SAT-EQUALITY, and
ODD-MAX-TRUE-2SAT are complete in $\Theta_2^p$ under polynomial-time many-one
reduction.

Nevertheless, the reductions presented in the remaining part of the paper are
from the 3SAT versions since the 2SAT versions don't make the proofs easier.

⁵ The definitions of the 3SAT versions are given in section 3.2.

3 Applications of the Method

3.1 $\Theta_2^p$-Complete Combinatorial Problems

We define the following problem: 

Decision Problem: min-PolynomiallyWeighted-vertex cover compare

Instance: Two graphs $G_1 = (V_1,E_1,w_1)$ and $G_2 = (V_2,E_2,w_2)$. Each $v \in V_i$
is assigned a weight $w_i(v) \in [0, \|V_i\|]$ $(i \in \{1,2\})$.

Question: We define for each subset $V' \subseteq V_i$ the weight $w_i(V') = \sum_{v \in V'} w_i(v)$.
Let $mwvc(G_i)$ be the weight of the vertex cover of $G_i$ having minimum weight.
Does $mwvc(G_1) \le mwvc(G_2)$ hold?

Theorem 11. There is a polynomial-time many-one reduction from
MAX-TRUE-3SAT-COMPARE to min-PolynomiallyWeighted-vertex cover
compare. Hence min-PolynomiallyWeighted-vertex cover compare is
$\Theta_2^p$-complete. □

Proof. Our proof is based on the reduction from 3SAT to vertex cover given in
[GJ79]. We describe how the construction there is modified to obtain our result.
Assume that we are given $C_1$ and $C_2$, both having $n$ variables and $m$ clauses.
Our construction is accomplished in two steps:

1. Construct $G_1 = (V_1,E_1,w_1)$ from $C_1$ and $G_2 = (V_2,E_2,w_2)$ from $C_2$ as in
the reduction from 3SAT to vertex cover. Note that $\|V_1\| = \|V_2\|$.

2. Assign weights to the vertices of $G_k$ $(k \in \{1,2\})$:

$w_k(a_1[j]) = w_k(a_2[j]) = w_k(a_3[j]) = 2m + n + 1$ $(0 < j \le m)$

$w_k(u_i) = 2m + n + 1$

$w_k(\bar{u}_i) = 2m + n + 2$ $(0 < i \le n)$

The weights of the vertices are chosen such that for each vertex cover $V'_i \subseteq V_i$
of $G_i$ the following holds:

- If $\|V'_i\| = 2m + n$ then $V'_i$ has weight of no more than $(2m + n)(2m + n + 2)$.

- If $\|V'_i\| > 2m + n$ then $V'_i$ has weight of at least $(2m + n + 1)(2m + n + 1) =
(2m + n)(2m + n + 2) + 1$.

Hence the vertex covers of $G_i$ with minimum weight are among the vertex covers
of $G_i$ with cardinality $2m + n$. As discussed in the classical proof of [GJ79], each
vertex cover of cardinality $2m + n$ defines a satisfying truth assignment. It remains
to verify the following two assertions:

1. To each satisfying truth assignment $t$ for $U$ with $\|\{i : t(u_i) = 0\}\| = s$ belongs
a vertex cover with weight $(2m+n+1)2m + (2m+n+1)(n-s) + (2m+n+2)s$.

2. To each vertex cover with weight $(2m+n+1)2m + (2m+n+1)(n-s) +
(2m+n+2)s$ and $0 \le s \le n$ belongs a satisfying truth assignment for $U$
with $\|\{i : t(u_i) = 0\}\| = s$. □

Decision Problem: min-card-vertex cover compare

Instance: Two graphs $G_1 = (V_1,E_1)$ and $G_2 = (V_2,E_2)$.

Question: Let $\kappa_i =_{df} \min\{\|V'\| : V' \subseteq V_i$ and $V'$ is a vertex cover of $G_i\}$
$(i \in \{1,2\})$. Does $\kappa_1 \le \kappa_2$ hold?

Theorem 12. There is a polynomial-time many-one reduction from
min-PolynomiallyWeighted-vertex cover compare to min-card-vertex
cover compare. Hence min-card-vertex cover compare is $\Theta_2^p$-complete.

Proof. Let $G = (V,E,w)$ be an arbitrary polynomially vertex-weighted graph as
occurring in instances of min-PolynomiallyWeighted-vertex cover compare.
We obtain a graph $G' = (V',E')$ such that

$mwvc(G) = \min\{\|V''\| : V'' \subseteq V'$ and $V''$ is a cover of $G'\}$

by defining

$V' =_{df} \{(u,1),\ldots,(u,w(u)) : u \in V\}$

and

$E' =_{df} \{\{(u,i),(v,j)\} : \{u,v\} \in E \wedge 1 \le i \le w(u) \wedge 1 \le j \le w(v)\}$ .

The transformation from an instance $(G_1 = (V_1,E_1,w_1), G_2 = (V_2,E_2,w_2))$
of min-PolynomiallyWeighted-vertex cover compare to min-card-vertex
cover compare is accomplished by applying this construction to $G_1$ and $G_2$. □
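The vertex-replication step can be tried out on a small graph. The sketch below is an illustration under assumptions: the function names are hypothetical, graphs are given as vertex lists and edge lists, and the minimum covers are found by brute force, so it is only suitable for tiny instances.

```python
from itertools import combinations

def blow_up(vertices, edges, weight):
    """Replace each vertex u by w(u) copies and each edge {u, v} by the
    complete bipartite graph between the copies of u and the copies of v."""
    new_vertices = [(u, i) for u in vertices for i in range(1, weight[u] + 1)]
    new_edges = [((u, i), (v, j))
                 for (u, v) in edges
                 for i in range(1, weight[u] + 1)
                 for j in range(1, weight[v] + 1)]
    return new_vertices, new_edges

def min_weight_cover(vertices, edges, weight):
    """Brute-force minimum weight of a vertex cover (exponential time)."""
    best = None
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            chosen = set(subset)
            if all(u in chosen or v in chosen for (u, v) in edges):
                total = sum(weight[u] for u in chosen)
                if best is None or total < best:
                    best = total
    return best

def min_card_cover(vertices, edges):
    """Minimum cardinality of a vertex cover, via unit weights."""
    return min_weight_cover(vertices, edges, {u: 1 for u in vertices})
```

The equality of the two optima reflects the fact that a minimal cover of the blown-up graph contains either all copies of a vertex or none of them.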

Note that min-card-vertex cover compare remains $\Theta_2^p$-complete for instances
satisfying $\|V_1\| = \|V_2\|$.

Of course, theorem 12 can be stated in terms of INDEPENDENT SET and CLIQUE 
as well. 

Theorem 13. max-card-independent set compare and max-card-clique
compare are $\Theta_2^p$-complete.



3.2 Transcription to $\Delta_2^p$: TSPcompare and TSPequality

We transcribe the technique used in subsection 2.1 to $\Delta_2^p = P^{NP}$.

Decision Problem: MAX-LEX-3SAT-COMPARE

Instance: Two 3-CNF formulas $F_1$ and $F_2$ having the same number of clauses
and variables

Question: Is the lexicographic maximum satisfying truth assignment for $F_1$ less
than or equal to that for $F_2$?

MAX-LEX-3SAT-EQUALITY is defined analogously.

Theorem 14. MAX-LEX-3SAT-COMPARE and MAX-LEX-3SAT-EQUALITY are complete
in $\Delta_2^p$ under polynomial-time many-one reduction.

For proving the hardness we need the following lemma. 

Lemma 15. For every $A \in \Delta_2^p$ there are $B_1, B_2 \in P$ having the following properties:

1. $x \in A \rightarrow \max\{w : (x,w) \in B_1\} = \max\{w : (x,w) \in B_2\}$

$x \notin A \rightarrow \max\{w : (x,w) \in B_1\} > \max\{w : (x,w) \in B_2\}$

2. There is a polynomial $p$ such that $\bigwedge_{x,w \in \Sigma^*} ((x,w) \in B_i \rightarrow |w| = p(|x|))$ $(i \in \{1,2\})$

Proof. We start from the characterization of $\Delta_2^p$-sets given by Krentel [Kre88]
and define $B_1$ and $B_2$ as in the proof of lemma 3, where the strings
“$out_M(x,\beta)^{p(|x|)}\beta$” are substituted by “$out_M(x,\beta)\beta$”. □

In order to complete the proof of theorem 14, the desired 3CNF formulas $F_1$ and
$F_2$ can be constructed from $B_1$ and $B_2$ following the ideas of the proof of theorem 2.
We are now able to prove that, given two instances of traveling-salesperson, it is
$\Delta_2^p$-complete to decide if the optimal tour length for the first TSP instance is
not longer than that for the second.

We assume the reader to be familiar with the NP-completeness proof for the 
Hamilton path problem given by Machtey/ Young [MJ78], p. 244ff. 

We define the following problems TSPcompare and TSPequality.

Decision Problem: TSPcompare

Instance: Two matrices $(M^1_{ij})$ and $(M^2_{ij})$, each consisting of nonnegative
integer “distances” between $s$ cities

Question: Let $t_k$ be the length of the optimal tour for $M^k_{ij}$ $(k \in \{1,2\})$.
Does $t_1 \le t_2$ hold?

TSPequality is defined analogously.

Theorem 16. TSPcompare and TSPequality are $\Delta_2^p$-complete.

Proof. For intermediate steps we need the problems weighted directed
Hamilton circuit compare/equality and weighted undirected Hamilton
circuit compare/equality, which are defined in an obvious way.

Our reduction chain looks as follows:

MAX-LEX-3SAT-COMPARE/EQUALITY $\le_m^p$ weighted directed Hamilton circuit
compare/equality $\le_m^p$ weighted undirected Hamilton circuit
compare/equality $\le_m^p$ TSPcompare/equality.

For the reduction from MAX-LEX-3SAT-COMPARE/EQUALITY to weighted
directed Hamilton circuit compare/equality consider the reduction from
3SAT to the Hamilton path problem given in [MJ78]. Identifying the vertices
$v_{n+1}$ and $v_1$ we get a reduction from 3SAT to directed Hamilton circuit.
Given two 3CNF formulas $F_1$ and $F_2$ let $G_1$/$G_2$ be the directed graphs belonging
to $F_2$/$F_1$ according to the construction in the proof there. For each variable $x_i$
there is a corresponding vertex $v_i$ (in both graphs). Each such vertex $v_i$ has two
outgoing arcs: the “upper” arc corresponding to occurrences of $x_i$, the “lower”
arc corresponding to occurrences of $\neg x_i$. We set the weight of the lower arc of
each vertex $v_i$ to $2^{n-i}$. All other arcs get weight 0.
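The effect of these arc weights can be checked numerically. The sketch below assumes, as the construction suggests, that a Hamilton circuit encoding a truth assignment leaves $v_i$ through the lower arc exactly when $x_i$ is false; the function names are hypothetical. Under that assumption the circuit weight is $2^n - 1$ minus the value of the assignment read as a binary number, so a lexicographically larger assignment gives a lighter circuit, which is consistent with $G_1$/$G_2$ being built from $F_2$/$F_1$ in swapped order.

```python
def circuit_weight(assignment):
    """Total weight of the lower arcs used: 2^(n-i) for every false x_i.
    `assignment` is a 0/1 list for x_1, ..., x_n."""
    n = len(assignment)
    return sum(2 ** (n - i)
               for i, bit in enumerate(assignment, start=1) if bit == 0)

def lex_value(assignment):
    """The assignment read as a binary number, x_1 most significant."""
    value = 0
    for bit in assignment:
        value = 2 * value + bit
    return value
```

For instance, with $n = 3$ the assignment 101 has value 5 and circuit weight $2^{3-2} = 2 = 7 - 5$.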

The reduction from weighted directed Hamilton circuit compare/equality
to weighted undirected Hamilton circuit compare/equality is accomplished
by expanding each node into a trio of nodes: one for incoming edges,
one for outgoing edges, and a middle node to force us to go from in to out [MJ78].

For the reduction from weighted undirected Hamilton circuit compare/equality
to TSPcompare/equality we set

$M^k_{ij} =_{df} w_k(\{i,j\})$ if $\{i,j\} \in E_k$, and $M^k_{ij} =_{df} 2^n$ otherwise,

for undirected weighted graphs $G_1 = (V_1,E_1,w_1)$ and $G_2 = (V_2,E_2,w_2)$. □
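A small sketch of this last step follows. It is illustrative only: the function names are hypothetical, the penalty for missing edges is passed in by the caller (the value $2^n$ is the reading assumed here, chosen because all circuit weights stay below it), the diagonal is set to 0 by assumption, and the optimal tour is found by brute force.

```python
from itertools import permutations

def tsp_matrix(s, weighted_edges, penalty):
    """Distance matrix for s cities: edge weight where an edge exists,
    `penalty` for every missing edge, 0 on the diagonal (assumed)."""
    M = [[0 if i == j else penalty for j in range(s)] for i in range(s)]
    for (i, j), w in weighted_edges.items():
        M[i][j] = M[j][i] = w
    return M

def optimal_tour_length(M):
    """Brute-force optimal tour length, fixing city 0 as the start."""
    s = len(M)
    best = None
    for perm in permutations(range(1, s)):
        tour = (0,) + perm
        length = sum(M[tour[k]][tour[(k + 1) % s]] for k in range(s))
        if best is None or length < best:
            best = length
    return best
```

If the graph has a Hamilton circuit of weight below the penalty, the optimal tour uses only real edges, so its length equals the minimum circuit weight.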

3.3 Voting Schemes 

A lot of different voting systems are extensively treated in the social choice 
literature. For an overview of that field consult e.g. [Fish77]. 

Bartholdi, Tovey, and Trick [BTT89] initiated the investigation of the computa- 
tional complexity of voting systems: 

For the Dodgson as well as the Kemeny voting systems they proved that it is 
NP-hard to determine if a given candidate is a winner of an election under that 
voting scheme. 

Hemaspaandra/Hemaspaandra/Rothe [HHR97] improved this lower bound for the
Dodgson voting scheme by proving its $\Theta_2^p$-hardness. Together with the almost
trivial $\Theta_2^p$ upper bound (holding for both problems) this allows an exact
classification of the complexity of Dodgson voting.

A similar result for the Kemeny voting scheme has been unknown for a long
time. Applying theorem 2 we are not only able to settle this question, showing
the completeness of the Kemeny voting scheme in $\Theta_2^p$, but also to simplify the
proof given in [HHR97] considerably.

Hemaspaandra/Hemaspaandra [HH00] stated in an invited talk at the MFCS
2000 conference the result for Kemeny's voting scheme presented in a paper by
E. Hemaspaandra [Hem00].

An analysis of Dodgson's and Kemeny's voting systems reveals that the winner
problems in fact entail a comparison between two instances of optimization
problems:

Dodgson's voting scheme: Compare the Dodgson score of a candidate $c_1$
with the one of a second candidate $c_2$.

The Dodgson score of a candidate is the minimum number of switches in 
the voters’ preference orders such that this candidate becomes a Condorcet 
winner. See [Con85]. 

Kemeny's voting scheme: Compare the Kemeny score of a candidate $c_1$ with
the one of a second candidate $c_2$.

The Kemeny score of a candidate c is the sum of the distances of a preference 
order P to the preferences of the voters, where P is a preference order with c 
in first place minimizing this sum. For the definition of the distance measure 
see [BTT89]. 

Thus both problems are similar in nature to MAX-TRUE-3SAT-COMPARE and
min-card-vertex cover compare.

A complete version of this topic is given in [SV00].

Acknowledgement We want to thank the anonymous referees and J. Rothe
for their valuable hints and suggestions.

Abstract. We investigate the question whether there is a (p-)optimal
proof system for SAT or for TAUT and its relation to completeness 
and collapse results for nondeterministic function classes. A p-optimal 
proof system for SAT is shown to imply (1) that there exists a complete 
function for the class of all total nondeterministic multi-valued func- 
tions and (2) that any set with an optimal proof system has a p-optimal 
proof system. By replacing the assumption of the mere existence of a
(p-)optimal proof system by the assumption that certain proof systems are
(p-)optimal we obtain stronger consequences, namely collapse results for 
various function classes. Especially we investigate the question whether 
the standard proof system for SAT is p-optimal. We show that this as- 
sumption is equivalent to a variety of complexity theoretical assertions 
studied before, and to the assumption that every optimal proof system 
is p-optimal. Finally, we investigate whether there is an optimal proof 
system for TAUT that admits an effective interpolation, and show some 
relations between various completeness assumptions. 



1 Introduction and Overview 

Following Cook and Reckhow [3] we define the notion of an abstract proof system
for a set $L \subseteq \{0,1\}^*$ as follows. A (possibly partial) polynomial-time computable
function $h : \{0,1\}^* \rightarrow \{0,1\}^*$ with range $L = \{h(x) \mid x \in \{0,1\}^*\}$ is called a
proof system for $L$. In this setting, an $h$-proof for the membership of $\varphi$ in $L$ is
given by a string $w$ with $h(w) = \varphi$. In order to compare the relative strength
of different proof systems for the set TAUT of all propositional tautologies,
Cook and Reckhow introduced the notion of p-simulation. A proof system $h$
p-simulates a proof system $g$ if $g$-proofs can be translated into $h$-proofs in
polynomial time, i.e., there is a polynomial-time computable function $f$ such that for
each $v$ in the domain of $g$, $h(f(v)) = g(v)$. Similarly, $h$ is said to simulate $g$ if for
each $g$-proof $v$ there is an $h$-proof $w$ of length polynomial in the length of $v$ with
$h(w) = g(v)$. A proof system for a set $L$ is called (p-)optimal if it (p-)simulates
every proof system for $L$ (cf. [12]). It is a natural question whether a set $L$ has
a p-optimal (or at least an optimal) proof system. Note that a p-optimal proof
system has the advantage that from any proof in another proof system one can
efficiently obtain a proof for the same instance in the p-optimal proof system.
Hence, any method that is used to compute proofs in some proof system can be
reformulated to yield proofs in the p-optimal proof system with little overhead.

It is observed in [19,16,11] that (p-)optimal proof systems for certain
languages can be used to define complete sets for certain promise classes. For
example, if TAUT has an optimal proof system then NP $\cap$ Sparse has a many-one
complete set, and if TAUT as well as SAT have a p-optimal proof system then
NP $\cap$ co-NP has a complete set. We complete this picture here by showing that
already a p-optimal proof system for SAT can be used to derive completeness
consequences.

These results are however unsatisfactory in so far as they provide only
necessary conditions for the existence of (p-)optimal proof systems. It appears that a
much stronger assumption like NP = P is needed to derive a p-optimal proof
system for TAUT (actually, a somewhat weaker collapse condition suffices, namely
that all tally sets in nondeterministic double exponential time are contained in
deterministic double exponential time; see [11]). If, however, we consider proof
systems with certain additional properties then we can indeed derive collapse
consequences from the assumption that these proof systems are (p-)optimal. We
consider two examples:

The best known proof system is probably the standard proof system for SAT
where proofs are given by a satisfying assignment for the formula in question.
Adapting this proof method to the current setting one obtains the following
natural proof system for SAT:

$sat(x) = \begin{cases} \varphi & \text{if } x = \langle\alpha,\varphi\rangle \text{ and } \alpha \text{ is a satisfying assignment for } \varphi \\ \text{undef.} & \text{otherwise.} \end{cases}$
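As an illustration, the proof system sat can be written as a partial function that checks its input and returns the formula it proves. The encoding below is an assumption made for the example only: a proof is a Python pair of an assignment (a tuple of booleans) and a formula in CNF (a tuple of clauses of nonzero integers), and "undefined" is modelled by `None`.

```python
def sat(x):
    """Return the formula phi if x = (alpha, phi) and alpha satisfies phi;
    return None (undefined) otherwise. Runs in time linear in the input."""
    if not (isinstance(x, tuple) and len(x) == 2):
        return None
    alpha, phi = x
    try:
        ok = all(any((lit > 0) == alpha[abs(lit) - 1] for lit in clause)
                 for clause in phi)
    except (IndexError, TypeError):
        return None            # malformed proof: treated as undefined
    return phi if ok else None
```

The range of this function is exactly the set of satisfiable formulas in the chosen encoding, which is what makes it a proof system for SAT in the sense defined above.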

We consider the question whether sat is p-optimal¹ in Sect. 3. It turns out that
the assumption of sat being p-optimal is equivalent to a variety of well studied
complexity theoretic assumptions (which have unlikely collapse consequences
as, e.g., that NP $\cap$ co-NP = P). Most of these assumptions were listed in [5]
under “Proposition Q” (see also [6]). Proposition Q states for example that any
function in the class $NPMV_t$ of total multi-valued functions computable in
nondeterministic polynomial time has a refinement in FP. We further add to
this list the statement that every optimal proof system is p-optimal.

As a second example, we consider in Sect. 4 proof systems that admit an
effective interpolation. Due to Craig's Interpolation Theorem for Propositional
Logic, for any tautology $\varphi \rightarrow \psi$ there is a formula $\chi$ that uses only common
variables of $\varphi$ and $\psi$ such that $\varphi \rightarrow \chi$ and $\chi \rightarrow \psi$. A circuit $C$ that computes
the same function as $\chi$ is called an interpolant of $\varphi \rightarrow \psi$. Following [13] we
say that a proof system $h$ for TAUT admits an effective interpolation if there
is a polynomial $p$ such that for any $h$-proof $w$ of a formula $h(w) = \varphi \rightarrow \psi$, the
formula $\varphi \rightarrow \psi$ has an interpolant of size at most $p(|w|)$. We show that if TAUT
has an optimal proof system with this property then any function in NPSV (the
class of single valued functions computable in nondeterministic polynomial time)
has a total extension in FP/poly. The latter is equivalent to the statement that
every disjoint pair of NP-sets is P/poly-separable, which in turn implies that
NP $\cap$ co-NP $\subseteq$ P/poly and that UP $\subseteq$ P/poly.

¹ Pavel Pudlák posed this question during the discussion after Zenon Sadowski's talk
at CSL'98 [20].

The (likely) assumption that there are no p-optimal proof systems for SAT (as
well as for TAUT) also has some practical implications due to its connection to
the existence of optimal algorithms (see [12,20,15]). Note that usually a decision
algorithm for SAT also provides a satisfying assignment for any positive instance.
However, if sat is not p-optimal then there is a set $S \subseteq$ SAT of easy instances
(i.e. $S \in$ P) for some of which it is hard to produce a satisfying assignment (i.e.,
there is no polynomial time algorithm that produces a satisfying assignment on
all inputs from $S$, cf. Theorem 1). In fact, a stronger consequence can be derived:
if sat is not p-optimal then there is a non-sparse set of easy instances from SAT
for which it is hard to produce a satisfying assignment (see Theorem 5).

The observation from [19,16,11] that a p-optimal proof system for a set $L$
implies the existence of a complete set for a certain promise class in fact shows
a relationship between different completeness assumptions. Since the definition
of p-simulation is equivalent to the definition of many-one reducibility between
functions (in the sense of [11]), a proof system for $L$ is p-optimal if and only if
it is many-one complete for the (promise) function class $\mathcal{PS}_L$ that consists of
all proof systems for $L$. Depending on the complexity of $L$, this completeness
assumption can be used to derive the existence of complete sets for various other
promise classes. This observation motivates us to further investigate whether
there are relations between various completeness assumptions. Along this line of
investigation we show in Sect. 5 that

- NPSV has a many-one complete function if and only if there is a strongly
many-one complete disjoint NP-pair,

- a complete function for $NPMV_t$ implies a many-one complete pair for the
class of disjoint co-NP pairs, and

- $NPSV_t = NPMV_t \cap NPSV$ has a many-one complete function if and only
if NP $\cap$ co-NP has a complete set.

The collapse consequences for the nondeterministic function classes $NPMV_t$
and NPSV that are obtained in Sects. 3 and 4 from the assumption that sat is
p-optimal, respectively that there is an optimal proof system for TAUT that
admits an effective interpolation, are complemented by the following completeness
consequences (presented in Sect. 6):

- If SAT has a p-optimal proof system then $NPMV_t$ has a complete function.

- If TAUT has an optimal proof system then NPSV has a complete function.

Further we show that

- SAT has a p-optimal proof system if and only if any language with an optimal
proof system also has a p-optimal proof system.

This result again complements the observation from Sect. 3 that sat is p-optimal
if and only if every optimal proof system is p-optimal. As an application we can
weaken the assumption used in [11] to show that NP $\cap$ co-NP has a complete
set: it suffices to assume that SAT has a p-optimal proof system and TAUT has
an optimal proof system.

Due to the limited space, several results are presented here without proof. 



2 Preliminaries 

Let $\Sigma = \{0,1\}$. We denote the cardinality of a set $A$ by $\|A\|$ and the length of
a string $x \in \Sigma^*$ by $|x|$. A set $S$ is called sparse if the cardinality of $S \cap \Sigma^n$ is
bounded above by a polynomial in $n$. FP denotes the class of (partial) functions
that can be computed in polynomial time. We use $\langle\cdot,\ldots,\cdot\rangle$ to denote a standard
polynomial-time computable tupling function. The definitions of standard
complexity classes like P, NP, etc. can be found in books like [1,17]. For a class $\mathcal{C}$
of sets we call a pair $(A,B)$ of disjoint sets $A, B \in \mathcal{C}$ a disjoint $\mathcal{C}$-pair. If for a
class $\mathcal{D}$ and some $D \in \mathcal{D}$ it holds that $A \subseteq D$ and $B \cap D = \emptyset$, we call the pair $(A,B)$
$\mathcal{D}$-separable.

A nondeterministic polynomial time Turing machine (NPTM, for short) is a
Turing machine $N$ such that for some polynomial $p$, every accepting path of $N$
on any input of length $n$ is at most of length $p(n)$. A nondeterministic transducer
is a nondeterministic Turing machine $T$ with a write-only output tape. On input
$x$, $T$ outputs $y \in \Sigma^*$ (in symbols: $T(x) \mapsto y$) if there is an accepting path on
input $x$ along which $y$ is written on the output tape. Hence, the function $f$
computed by $T$ on $\Sigma^*$ could be multi-valued and partial. Using the notation
of [2,23] we denote the set $\{y \mid f(x) \mapsto y\}$ of all output values of $T$ on input
$x$ by set-$f(x)$. NPMV denotes the class of all multi-valued, partial functions
computable by some nondeterministic polynomial-time transducer. NPSV is the
class of functions $f$ in NPMV that are single-valued, i.e. $\|$set-$f(x)\| \le 1$ (thus,
a single-valued multi-valued function is a function in the usual sense, and we
use $f(x)$ to denote the unique string in set-$f(x)$). The domain of a multi-valued
function is the set of those inputs $x$ where set-$f(x) \ne \emptyset$. A function is called total
if its domain is $\Sigma^*$. For a function class $\mathcal{F}$ we denote by $\mathcal{F}_t$ the class of total
functions in $\mathcal{F}$. We use $NPMV_t \subseteq_c FP$ to indicate that for any $g \in NPMV_t$
there is a total function $f \in FP$ that is a refinement of $g$, i.e. $f(x) \in$ set-$g(x)$
for all $x \in \Sigma^*$. We say that a multi-valued function $h$ many-one reduces to a
multi-valued function $g$ if there is a function $f \in FP$ such that for every $x \in \Sigma^*$,
set-$g(f(x))$ = set-$h(x)$.

For a function class $\mathcal{F}$ a function $h$ is called $\mathcal{F}$-invertible if there is a function
$f \in \mathcal{F}$ that inverts $h$, i.e. $h(f(y)) = y$ for each $y$ in the range of $h$. A function $h$
is honest if for some polynomial $p$, $p(|h(x)|) \ge |x|$ holds for all $x$ in the domain
of $h$. We call a function $g$ an extension of a function $f$ if $f(x) = g(x)$ for any $x$
in the domain of $f$. A function $r : \mathbb{N} \rightarrow \mathbb{N}$ is called super-polynomial if for each
polynomial $p$, $r(n) > p(n)$ for almost every $n \ge 0$. A set $B \in$ P with $B \subseteq L$ is
called a P-subset of $L$.

3 Q and the P-Optimality of sat

In [5] the following statements were all shown to be equivalent. There, Q is
defined to be the proposition that one (and consequently each) of these statements
is true. In this section we show that Q is also equivalent to the p-optimality of
sat.

Theorem 1 ([5], cf. [6]). The following statements are equivalent.

1. For each NPTM $N$ that accepts SAT there is a function $f \in FP$ such that
for each $\alpha$ encoding an accepting path of $N$ on input $\varphi$, $f(\alpha)$ is a satisfying
assignment of $\varphi$.

2. Each honest function $f \in FP$ with range $\Sigma^*$ is FP-invertible.

3. $NPMV_t \subseteq_c FP$.

4. For all P-subsets $S$ of SAT there exists a function $g \in FP$ such that for all
$\varphi \in S$, $g(\varphi)$ is a satisfying assignment of $\varphi$.

Clearly, each nondeterministic Turing machine $N$ corresponds to a proof
system $h$ with $h(\alpha) = \varphi$ if $\alpha$ encodes an accepting path of $N$ on input $\varphi$. Now
$h$ is honest if, and only if, $N$ is an NPTM. This leads to the observation that
Statement 1 in Theorem 1 is equivalent to the condition that sat p-simulates
every honest proof system for SAT. Hence, we just need to delete the term
‘polynomial-time’ in Statement 1 of Theorem 1 to obtain the desired result
that Q is equivalent to the p-optimality of sat. That this is possible without
changing the truth of the theorem can be shown by a padding argument.

Theorem 2. The following statements are equivalent.

1. For each NPTM $N$ that accepts SAT there is a function $f \in FP$ such that
for each $\alpha$ encoding an accepting path of $N$ on input $\varphi$, $f(\alpha)$ is a satisfying
assignment of $\varphi$.

2. For each nondeterministic Turing machine $N$ that accepts SAT there is a
function $f \in FP$ such that for each $\alpha$ encoding an accepting path of $N$ on
input $\varphi$, $f(\alpha)$ is a satisfying assignment of $\varphi$.

3. sat is a p-optimal proof system for SAT.

It is known that the assumption JVV = V implies NVTiVt Qc TV which 
in turn implies NV H co-AfV = V (cf. [24]). Also, in [9] it has been shown that 
the converse of these implications is not true in suitable relativized worlds. The 
consequence AfV C co-NV — V also shows that the assumption that sat is p- 
optimal is presumably stronger than the assumption that SAT has a p-optimal 
proof system. Namely the p-optimality of sat implies that AfV n co-NV = V, 
whereas the existence of a p-optimal proof system follows already (see [11]) if 
any super-tally set in E 2 belongs to V (here, any set L C {0^^ | n > 0} is 

called super-tally). 

The assumption that sat is a p-optimal proof system also has an effect on
various reducibility degrees, as has been mentioned in [5] for Karp and Levin
reducibility. Also in [14] it is shown that $\mathrm{NPMV}_t \subseteq_c \mathrm{FP}$ if and only if
$\gamma$-reducibility equals polynomial time many-one reducibility. Furthermore it is
shown in [4] that Statement 4 of Theorem 1 is equivalent to the assumption
that the approximation class APX is closed under L-reducibility (see [4] for
definitions).

The equivalence between the p-optimality of sat and $\mathrm{NPMV}_t \subseteq_c \mathrm{FP}$
directly leads to a proof of the following theorem.

Theorem 3. The following statements are equivalent. 

1. sat is p-optimal.

2. For any language $L$ and all proof systems $h$ and $g$ for $L$:
$h$ p-simulates $g$ if and only if $h$ simulates $g$
(i.e., the corresponding quasi-orders coincide).

3. Every optimal proof system is p-optimal. 

In [15] it is observed that given a p-optimal proof system $h$ for a language
$L$, the problem of finding an $h$-proof for $y \in L$ is not much harder than deciding
$L$, i.e. we can transform each deterministic Turing machine $M$ with $L(M) = L$
into a deterministic Turing machine $M'$ that on input $y \in L$ yields an $h$-proof of
$y$ in $\mathrm{time}_{M'}(y) \le p(|y| + \mathrm{time}_M(y))$ for some polynomial $p$ determined by $M$.

Using this observation and the equivalences in Theorems 1 and 2 we obtain the
following result: sat is p-optimal if and only if any deterministic Turing machine
$M$ that accepts SAT can be converted to a deterministic Turing machine that
computes a satisfying assignment for any formula $\varphi \in$ SAT and runs not much
longer than $M$ on input $\varphi$.

Theorem 4. The following statements are equivalent. 

1. sat is p-optimal.

2. For any deterministic Turing machine $M$ that accepts SAT, there is a
deterministic Turing machine $M'$ and a polynomial $p$ such that for every
$\varphi \in$ SAT, $M'$ produces a satisfying assignment of $\varphi$ in
$\mathrm{time}_{M'}(\varphi) \le p(|\varphi| + \mathrm{time}_M(\varphi))$ steps.

Under the assumption that sat is not p-optimal it follows from Theorem 4
that there is a Turing machine $M$ that decides SAT such that any machine $M'$
that on input $\varphi \in$ SAT has to produce a satisfying assignment for $\varphi$ is much
slower on some SAT instances. In some sense this appears counterintuitive, as
probably all SAT algorithms used in practice produce a satisfying assignment in
case the input belongs to SAT. Of course it follows from Theorem 4 that $M$ is
superior to any such $M'$ on an infinite set of instances. As shown in the following
theorem there is a deterministic Turing machine $M$ accepting SAT that is more
than polynomially faster than any deterministic transducer $M'$ that produces
satisfying assignments on a fixed non-sparse set of SAT instances. The result is
due to the paddability of SAT; its proof uses ideas from the theory of complexity
cores (see [21]).
Theorem 5. The following statements are equivalent.

1. sat is not p-optimal.

2. There is a $\mathrm{P}$-subset $S$ of SAT (i.e. there is a Turing machine $M$ accepting
SAT that has a polynomial time-bound on instances from $S$), a non-sparse
subset $L$ of $S$, and a super-polynomial function $f$ such that for any
deterministic Turing machine $M'$ that on input of any $\varphi \in L$ produces a satisfying
assignment of $\varphi$ it holds $\mathrm{time}_{M'}(\varphi) > f(|\varphi|)$ for almost every $\varphi \in L$.

4 Collapse of NPSV and Effective Interpolation

In [22] the following hypothesis (called H2) has been examined. 

For every polynomial time uniform family of formulas $\{\varphi_n, \psi_n\}$ such
that for every $n$, $\varphi_n$ and $\psi_n$ have $n$ common variables and $\varphi_n \to \psi_n$ is
a tautology, there is a polynomial $p$ and a circuit family $\{C_n\}$ where for
each $n$, $C_n$ is of size at most $p(n)$, has $n$ inputs and is an interpolant of
$\varphi_n \to \psi_n$.



Theorem 6 ([22]). The following statements are equivalent. 

1. H2. 

2. Every disjoint pair of $\mathrm{NP}$-sets is $\mathrm{P/poly}$-separable.

3. Every function in $\mathrm{NPSV}$ has a total extension in $\mathrm{FP/poly}$.

As mentioned in [22], it follows that H2 implies $\mathrm{NP} \cap \text{co-NP} \subseteq \mathrm{P/poly}$
and $\mathrm{UP} \subseteq \mathrm{P/poly}$.

In [5] and [6] a statement called Q' that is implied by Q is also examined.
Q' is equivalent to the statement that all disjoint co-NP-pairs are $\mathrm{P}$-separable.
Thus, as proposed in [22], H2 is a nonuniform version of the dual condition to Q',
namely that every disjoint pair of $\mathrm{NP}$-sets is $\mathrm{P}$-separable.

It is observed in [13] that extended Frege proof systems do not admit an
effective interpolation if the RSA cryptosystem is secure. Partly generalizing this
observation, one can state that the existence of an honest injective function in
$\mathrm{FP}$ that is not $\mathrm{FP/poly}$-invertible (i.e., a one-way function that is secure against
$\mathrm{FP/poly}$) implies the existence of a proof system for TAUT that does not admit
an effective interpolation. Notice that each injective function in $\mathrm{FP}$ is
invertible by an $\mathrm{NPSV}$-function. Thus the assumption that each $\mathrm{NPSV}$ function has
a total extension in $\mathrm{FP/poly}$ implies that every injective function is
$\mathrm{FP/poly}$-invertible. As the former assumption (that is equivalent to H2 by Theorem 6)
implies $\mathrm{NP} \cap \text{co-NP} \subseteq \mathrm{P/poly}$ and the latter is equivalent to $\mathrm{UP} \subseteq \mathrm{P/poly}$
(cf. [10,7]) it is presumably stronger. In the following theorem we observe that
H2, respectively the statement that every function in $\mathrm{NPSV}$ has a total
extension in $\mathrm{FP/poly}$, is true if, and only if, every proof system for TAUT admits an
effective interpolation.
Theorem 7. The following statements are equivalent.

1. Every function in $\mathrm{NPSV}$ has a total extension in $\mathrm{FP/poly}$.

2. Every proof system for TAUT admits an effective interpolation.

3. For any set $S \subseteq$ TAUT, $S \in \mathrm{NP}$, there is a polynomial $p$, such that any
formula $\varphi \to \psi \in S$ has an interpolant of size at most $p(|\varphi \to \psi|)$.



Proof. The implication 2 $\Rightarrow$ H2 is easy to see, as for every polynomial time
uniform family of tautologies $\varphi_n \to \psi_n$ one may define a proof system $h$ for
TAUT that has a short proof for any tautology of this family. Thus 2 $\Rightarrow$ 1
using Theorem 6.

The proof of the implication 1 $\Rightarrow$ 3 is obtained by extending an idea from
[22] that was used to prove the implication 3 $\Rightarrow$ 1 of Theorem 6. Let $S \subseteq$
TAUT, $S \in \mathrm{NP}$. Let $f$ be a function such that for any formula $\varphi \in S$, $\varphi =
\varphi_0(x,y) \to \varphi_1(x,z)$ (where $x, y, z$ denote vectors of variables), it holds

$f(\alpha, \varphi) = \begin{cases} 1 & \text{if for some } \beta,\ \varphi_0(\alpha,\beta) \text{ holds,} \\ 0 & \text{if for some } \gamma,\ \neg\varphi_1(\alpha,\gamma) \text{ holds.} \end{cases}$

Otherwise, and for any other input, let $f$ be undefined. First observe that $f$ is well
defined, i.e. that $f$ is single valued. This is due to the fact that $\varphi = \varphi_0(x,y) \to
\varphi_1(x,z) \in$ TAUT. Further, $f$ can be computed by a nondeterministic machine
$N$ that first (in deterministic polynomial time) validates that the input is of the
appropriate form $(\alpha,\varphi)$, $\varphi = \varphi_0(x,y) \to \varphi_1(x,z)$. Then $N$ guesses a certificate
for $\varphi \in S$ and, if successful, guesses some string $w$. Now if $w$ is of an appropriate
length and if $\varphi_0(\alpha,w)$ holds then $N$ outputs 1; if $\neg\varphi_1(\alpha,w)$ holds, $N$ outputs
0. Hence $f \in \mathrm{NPSV}$. Assuming 1, $f$ has a total extension in $\mathrm{FP/poly}$. Thus
there is a polynomial $p$ and for any $n > 0$ a circuit $C_n$ of size at most $p(n)$ such
that for any tuple $v = (\alpha,\varphi)$ of length $n$ in the domain of $f$, $C_n(v) = f(v)$.
Fixing the input bits of $C_n$ that belong to the formula $\varphi$ we obtain a circuit $C_\varphi$
with $C_\varphi(\alpha) = C_n((\alpha,\varphi)) = f((\alpha,\varphi))$ and thus $C_\varphi$ is of size polynomial in $|\varphi|$.
Now observe that $C_\varphi$ is an interpolant for the formulas $\varphi_0(x,y)$ and $\varphi_1(x,z)$. If
$\varphi_0(\alpha,y)$ is satisfiable then $C_\varphi(\alpha) = 1$, and if $C_\varphi(\alpha) = 1$ then for no $\gamma$ it holds
$\neg\varphi_1(\alpha,\gamma)$ and therefore $\varphi_1(\alpha,z)$ is a tautology.

Finally the proof of the implication 3 $\Rightarrow$ 2 is obtained using padding
techniques. We omit the details due to the limited space. □
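
The interpolant built in the step 1 $\Rightarrow$ 3 can be checked by brute force on a tiny instance. The following Python sketch is only an illustration: the formulas `phi0` and `phi1`, the function names, and the use of exhaustive search (in place of the nondeterministic guess and of the circuit $C_\varphi$) are assumptions made for the example, not part of the proof.

```python
from itertools import product

BOOLS = [False, True]

def interpolant(phi0, n_y):
    """Return C with C(alpha) = 1 iff some beta satisfies phi0(alpha, beta)."""
    def c(alpha):
        return int(any(phi0(alpha, beta) for beta in product(BOOLS, repeat=n_y)))
    return c

def is_interpolant(c, phi0, phi1, n_x, n_y, n_z):
    """Check phi0(x, y) -> C(x) and C(x) -> phi1(x, z) for all assignments."""
    for alpha in product(BOOLS, repeat=n_x):
        for beta in product(BOOLS, repeat=n_y):
            if phi0(alpha, beta) and not c(alpha):
                return False
        for gamma in product(BOOLS, repeat=n_z):
            if c(alpha) and not phi1(alpha, gamma):
                return False
    return True

# Hypothetical tautology phi0 -> phi1 with one shared variable x1:
# phi0(x, y) = x1 and y1,  phi1(x, z) = x1 or z1.
phi0 = lambda x, y: x[0] and y[0]
phi1 = lambda x, z: x[0] or z[0]
c = interpolant(phi0, n_y=1)
```

Here `c` depends only on the shared variable, and `is_interpolant(c, phi0, phi1, 1, 1, 1)` confirms both implications for this instance. The search is exponential, so it says nothing about the size bound in Statement 3.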



It is easy to see that a proof system g admits an effective interpolation if there 
is a proof system h that simulates g, and h admits an effective interpolation. 
Hence, any proof system for TAUT admits an effective interpolation if there is 
an optimal proof system for TAUT that admits an effective interpolation. As a 
corollary we obtain 



Corollary 8. If there is an optimal proof system for TAUT that admits an 
effective interpolation then H2 holds. 
5 Relations between Completeness Assumptions

Using the characterization $\mathrm{NPSV}_t = \mathrm{FP}^{\mathrm{NP} \cap \text{co-NP}}$ [23,8] we obtain the
following result.

Theorem 9. $\mathrm{NP} \cap \text{co-NP}$ has a many-one complete set iff $\mathrm{NPSV}_t$ has a
many-one complete function.

Now let us consider the function class $\mathrm{NPSV}$. In the same way as $\mathrm{NPSV}_t$
corresponds to the language class $\mathrm{NP} \cap \text{co-NP}$, the function class $\mathrm{NPSV}$
corresponds to the class of all disjoint $\mathrm{NP}$-pairs. In fact, if we denote the class of all
0,1-valued functions in $\mathrm{NPSV}$ by $\mathrm{NPSV}_{\{0,1\}}$ then any function $h \in \mathrm{NPSV}_{\{0,1\}}$
can be identified with the $\mathrm{NP}$-pair $(A_0, A_1)$ where $A_b = \{x \in \Sigma^* \mid h(x) = b\}$.

Razborov [18] introduced a notion of many-one reducibility between disjoint
$\mathrm{NP}$-pairs, a stronger version of which was studied in [11]. Let $A_0, A_1, B_0, B_1$
be $\mathrm{NP}$-sets with $A_0 \cap A_1 = B_0 \cap B_1 = \emptyset$. The pair $(A_0,A_1)$ (strongly)
many-one reduces to $(B_0,B_1)$ if there is a function $f \in \mathrm{FP}$ such that $f(A_b) \subseteq B_b$
($f^{-1}(B_b) = A_b$, respectively) for $b \in \{0,1\}$. Actually, it is easy to see that $f$ is
a many-one reduction between two functions in $\mathrm{NPSV}_{\{0,1\}}$ if and only if $f$ is a
strong many-one reduction between the corresponding disjoint $\mathrm{NP}$-pairs. Thus,
the class of disjoint $\mathrm{NP}$-pairs has a strongly many-one complete pair if and only
if $\mathrm{NPSV}_{\{0,1\}}$ has a many-one complete function. As shown in the next theorem,
this is even equivalent to the assumption that $\mathrm{NPSV}$ has a many-one complete
function.
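
The difference between the two reducibility notions is easy to see on finite toy data. The sketch below is a hypothetical illustration in Python: the sets, the finite `universe` standing in for $\Sigma^*$, and the functions `f` and `g` are all made up for the example, and no complexity bound is modelled.

```python
def reduces(f, A, B):
    """Many-one reduction of pairs: f(A_b) is a subset of B_b for b in {0, 1}."""
    return all(f(x) in B[b] for b in (0, 1) for x in A[b])

def strongly_reduces(f, A, B, universe):
    """Strong reduction: the preimage of B_b under f (within universe) equals A_b."""
    return all({x for x in universe if f(x) in B[b]} == A[b] for b in (0, 1))

universe = set(range(6))
A = ({0, 1}, {2, 3})      # disjoint pair (A_0, A_1); 4 and 5 lie in neither set
B = ({10}, {20})          # disjoint pair (B_0, B_1)

f = {0: 10, 1: 10, 2: 20, 3: 20, 4: 10, 5: 30}.get  # sends 4 into B_0
g = {0: 10, 1: 10, 2: 20, 3: 20, 4: 30, 5: 30}.get  # keeps 4 and 5 outside B_0 and B_1
```

Both `f` and `g` are reductions in the weaker sense, but only `g` is strong: `f` maps the element 4, which lies outside $A_0 \cup A_1$, into $B_0$.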

Theorem 10. The following statements are equivalent. 

1. $\mathrm{NPSV}$ has a many-one complete function.

2. $\mathrm{NPSV}_{\{0,1\}}$ has a many-one complete function.

3. There is a strongly many-one complete disjoint $\mathrm{NP}$-pair.

Proof. (Sketch). Implication 1 $\Rightarrow$ 2 is easy to prove, and the equivalence of 2
and 3 is clear by the preceding discussion. To see that 2 implies 1 we observe that
$\mathrm{NPSV}$ can be characterized as $\mathrm{FP}^{\mathrm{NPSV}_{\{0,1\}}}$ where the value $M^f(x)$ computed
by the deterministic oracle transducer $M$ on input $x$ is only defined if all oracle
queries belong to the domain of the functional oracle $f$. □

We conclude this section by observing that the class of disjoint co-NP-pairs
corresponds to the class $\mathrm{NPbV}_t$ of all 0,1-valued functions in $\mathrm{NPMV}_t$ studied
in [5] (with the disjoint co-NP-pair $(A_0,A_1)$ associate the function $h \in \mathrm{NPbV}_t$
defined by set-$h(x) = \{b \mid x \notin A_b\}$). Similar to the implication 1 $\Rightarrow$ 3 in
Theorem 10 the following theorem can be proved.

Theorem 11. If $\mathrm{NPMV}_t$ has a many-one complete function then there exists
a strongly many-one complete disjoint co-NP-pair.

We leave it open whether the reverse implication also holds.

6 Existence of (P-)Optimal Proof Systems

In Theorem 3 it is observed that sat is p-optimal iff every optimal proof system 
is p-optimal. Although the assumption of the mere existence of a p-optimal proof 
system for SAT is presumably weaker than the assumption that sat is p-optimal, 
it is still equivalent to a quite similar statement, namely that any set with an 
optimal proof system has a p-optimal proof system. For the proof of this result 
we use the following observation from [11]. 

Lemma 12 ([11]). If $L$ has a (p-)optimal proof system, and $T \le_m^p L$ then $T$
has a (p-)optimal proof system (respectively).



Theorem 13. The following statements are equivalent. 

1. SAT has a p-optimal proof system. 

2. Any language L that has an optimal proof system also has a p-optimal proof 
system. 

Proof. Clearly 2 $\Rightarrow$ 1, as SAT has an optimal proof system. To see the inverse
implication assume that SAT has a p-optimal proof system. Let $T_L$ (cf. [11,15]) be
the following language consisting of tuples $(M, x, 0^s)$ where $M$ is a deterministic
Turing transducer, $s \ge 0$ and $x \in \Sigma^*$.

$T_L = \{(M, x, 0^s) \mid \text{if } \mathrm{time}_M(x) \le s \text{ then } M(x) \in L\}$.

Notice that $T_L$ is many-one reducible to $L$ (without restriction assume $L \ne \emptyset$).
Hence, the assumption that there is an optimal proof system for $L$ implies that
$T_L$ has an optimal proof system, say $h$. Let

$S = \{((M, x, 0^s), 0^l) \mid \exists w,\ |w| \le l,\ h(w) = (M, x, 0^s)\}$.

Clearly $S \in \mathrm{NP}$. Therefore, by assumption and Lemma 12, there is a p-optimal
proof system $g$ for $S$. Let now $f$ be the following proof system.

$f(w) = \begin{cases} y & \text{if } g(w) = ((M, x, 0^s), 0^l),\ \mathrm{time}_M(x) \le s, \text{ and } M(x) = y, \\ \text{undef.} & \text{otherwise.} \end{cases}$

First notice that $y \in L$ if $f(w) = y$. This is due to the fact that $g(w) =
((M, x, 0^s), 0^l)$ implies $((M, x, 0^s), 0^l) \in S$ which in turn implies $(M, x, 0^s) \in T_L$.
We now show that $f$ p-simulates every proof system $f'$ for $L$. Assume that
$f'$ is computed by the transducer $M_{f'}$ in polynomial time $p(n)$. Observe that
$(M_{f'}, x, 0^{p(|x|)}) \in T_L$ for any $x \in \Sigma^*$. Hence, one may define a proof system
for $T_L$ such that for any $x$ the tuple $(M_{f'}, x, 0^{p(|x|)})$ has the short proof $1x$.

Consequently, due to the optimality of $h$, there is a polynomial $q$ such that
$(M_{f'}, x, 0^{p(|x|)})$ has an $h$-proof of size $\le q(|x|)$. Hence
$((M_{f'}, x, 0^{p(|x|)}), 0^{q(|x|)}) \in S$ for any $x$, and one may define a proof system
$g'$ for $S$ with $g'(1x) = ((M_{f'}, x, 0^{p(|x|)}), 0^{q(|x|)})$ for any $x$. As $g$ is p-optimal,
$g$ p-simulates $g'$, i.e. there is a function $t \in \mathrm{FP}$ such that
$g(t(1x)) = g'(1x) = ((M_{f'}, x, 0^{p(|x|)}), 0^{q(|x|)})$.
Observe now that $f(t(1x)) = f'(x)$ for any $x$. Hence $f$ p-simulates $f'$. □
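
The many-one reduction from $T_L$ to $L$ used in the proof above is a clocked simulation. A minimal Python sketch follows, under loud assumptions: a transducer is modelled as a generator function that yields once per step and returns its output, `y0` is a fixed element of $L$, and all names (`run_clocked`, `reduce_T_L_to_L`, `doubler`) are hypothetical.

```python
def run_clocked(machine, x, s):
    """Run machine on x for at most s steps; return (halted, output)."""
    gen = machine(x)
    for _ in range(s + 1):
        try:
            next(gen)                 # one simulated step
        except StopIteration as stop:
            return True, stop.value   # machine halted within the budget
    return False, None

def reduce_T_L_to_L(machine, x, s, y0):
    """Map (M, x, 0^s) to M(x) if M halts within s steps, and to y0 otherwise."""
    halted, out = run_clocked(machine, x, s)
    return out if halted else y0

def doubler(x):
    """Toy transducer: takes x steps and outputs 2 * x."""
    total = 0
    for _ in range(x):
        total += 2
        yield
    return total
```

If the machine halts within the budget, the tuple is in $T_L$ exactly when the returned output is in $L$; otherwise the tuple is trivially in $T_L$ and the reduction returns `y0`, which is in $L$. For example, `reduce_T_L_to_L(doubler, 3, 5, 0)` returns the machine's output, while a budget of 2 steps is too small and yields `y0`.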
As shown in [11], the assumption that SAT and TAUT both have p-optimal
proof systems implies that $\mathrm{NP} \cap \text{co-NP}$ has a many-one complete set. In fact,
due to Theorem 13 it suffices to assume that SAT has a p-optimal proof system
and TAUT only has an optimal proof system. Together with Theorem 9 we
obtain

Corollary 14. If SAT has a p-optimal and TAUT has an optimal proof system
then $\mathrm{NP} \cap \text{co-NP}$ has a many-one complete set, and $\mathrm{NPSV}_t$ has a many-one
complete function.

It is observed in [18] that the existence of an optimal proof system for TAUT
implies the existence of a many-one complete pair for the class of disjoint
$\mathrm{NP}$-pairs. In [11] it is shown that the same assumption allows one to infer the
existence of a strongly many-one complete disjoint $\mathrm{NP}$-pair which by Theorem 10
implies that $\mathrm{NPSV}$ has a many-one complete function.

Corollary 15. If TAUT has an optimal proof system then $\mathrm{NPSV}$ has a
many-one complete function.

Next we observe that a p-optimal proof system for SAT implies a complete
function for the class $\mathrm{NPMV}_t$. The proof uses ideas from [11] (in fact, the only
extension is that we are dealing with multi-valued functions here).

Theorem 16. If SAT has a p-optimal proof system then $\mathrm{NPMV}_t$ has a
many-one complete function.

By Theorem 11 we obtain

Corollary 17. If SAT has a p-optimal proof system then there exists a strongly
many-one complete disjoint co-NP-pair.

7 Conclusion 

We showed that the assumption that certain proof systems are (p-)optimal can 
be used to derive collapse results. Also we presented some relations between 
completeness assumptions for different classes. It would be interesting to know 
whether these observations could be extended to further proof systems and 
promise classes. 
Abstract. A general framework for typing graph rewriting systems is
presented: the idea is to statically derive a type graph from a given
graph. In contrast to the original graph, the type graph is invariant
under reduction, but still contains meaningful behaviour information.
We present conditions that a type system for graph rewriting should satisfy,
and a methodology for proving these conditions. In two case studies it
is shown how to incorporate existing type systems (for the polyadic
π-calculus and for a concurrent object-oriented calculus) into the general
framework.



1 Introduction 

In the past, many formalisms for the specification of concurrent and distributed 
systems have emerged. Some of them are aimed at providing an encompassing 
theory: a very general framework in which to describe and reason about intercon- 
nected processes. Examples are action calculi [18], rewriting logic [16] and graph 
rewriting [3] (for a comparison see [4]). They all contain a method of building 
terms (or graphs) from basic elements and a method of deriving reduction rules 
describing the dynamic behaviour of these terms in an operational way. 

A general theory is useful if concepts appearing in instances of the theory can
be generalised, yielding guidelines and relieving us of the burden of proving
universal concepts for every single special case. An example of such a generalisation
is the work presented for action calculi in [15], where a method for deriving a
labelled transition semantics from a set of reaction rules is presented. We
concentrate on graph rewriting (more specifically hypergraph rewriting) and attempt
to generalise the concept of type systems, where, in this context, a type may be
a rather complex structure.

Compared to action calculi¹ and rewriting logic, graph rewriting differs in a
significant way in that connections between components are described explicitly
(by connecting them by edges) rather than implicitly (by referring to the same
channel name). We claim that this feature, together with the fact that it is easy
to add an additional layer containing annotations and constraints to a graph,
can simplify the design of a type system and therefore the static analysis of a
graph rewriting system.

* Research supported by SFB 342 (subproject A3) of the DFG.

¹ Here we mean action calculi in their standard string notation. There is also a graph
notation for action calculi, see e.g. [7].

After introducing our model of graph rewriting and a method for annotating
graphs, we will present a general framework for type systems where both the
expression to be typed and the type itself are hypergraphs, and will show
how to reduce the proof obligations for instantiations of the framework. We
are interested in the following properties: correctness of a type system (if an
expression has a certain type, then we can conclude that this expression has
certain properties), the subject reduction property (types are invariant under
reduction) and compositionality (the type of an expression can always be derived
from the types of its subexpressions). Parts of the proofs of these properties can
already be conducted for the general case.

We will then show that our framework is realistic by instantiating it to two
well-known type systems: a type system avoiding run-time errors in the polyadic
π-calculus [17] and a type system avoiding “message not understood” errors in
a concurrent object-oriented setting. A third example enforcing a security policy
for untrustworthy applets is included in the full version [11].



2 Hypergraph Rewriting and Hypergraph Annotation 



We first define some basic notions concerning hypergraphs (see also [6]) and a
method for inductively constructing hypergraphs.

Definition 1. (Hypergraph) Let $L$ be a fixed set of labels. A hypergraph $H =
(V_H, E_H, s_H, l_H, \chi_H)$ consists of a set of nodes $V_H$, a set of edges $E_H$, a
connection mapping $s_H : E_H \to V_H^*$, an edge labelling $l_H : E_H \to L$ and a string
$\chi_H \in V_H^*$ of external nodes. A hypergraph morphism $\phi : H \to H'$ (consisting
of $\phi_V : V_H \to V_{H'}$ and $\phi_E : E_H \to E_{H'}$) satisfies² $\phi_V(s_H(e)) = s_{H'}(\phi_E(e))$
and $l_H(e) = l_{H'}(\phi_E(e))$. A strong morphism (denoted by the arrow $\twoheadrightarrow$)
additionally preserves the external nodes, i.e. $\phi_V(\chi_H) = \chi_{H'}$. We write $H \cong H'$ ($H$ is
isomorphic to $H'$) if there is a bijective strong morphism from $H$ to $H'$.

The arity of a hypergraph $H$ is defined as $ar(H) = |\chi_H|$ while the arity of an
edge $e$ of $H$ is $ar(e) = |s_H(e)|$. External nodes are the interface of a hypergraph
towards its environment and are used to attach hypergraphs.
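
Definition 1 translates directly into a small data structure. The Python sketch below is an illustration under assumptions: the class `Hypergraph`, its field names, and the function `is_morphism` are hypothetical, morphisms are given as dictionaries, and strings of nodes are tuples.

```python
from dataclasses import dataclass

@dataclass
class Hypergraph:
    nodes: set    # V_H
    conn: dict    # s_H: edge -> tuple of attached nodes (keys form E_H)
    label: dict   # l_H: edge -> label
    ext: tuple    # chi_H: string of external nodes

    def arity(self):
        return len(self.ext)

def is_morphism(phi_v, phi_e, g, h, strong=False):
    """Check that (phi_v, phi_e) is a (strong) hypergraph morphism from g to h."""
    if any(phi_v[v] not in h.nodes for v in g.nodes):
        return False
    for e, attached in g.conn.items():
        image = phi_e[e]
        if image not in h.conn:
            return False
        # connection and labelling must be preserved (phi_v applied pointwise)
        if tuple(phi_v[v] for v in attached) != h.conn[image]:
            return False
        if g.label[e] != h.label[image]:
            return False
    if strong and tuple(phi_v[v] for v in g.ext) != h.ext:
        return False
    return True

# One binary edge between two nodes, folded onto a loop at a single node.
g = Hypergraph({1, 2}, {"e": (1, 2)}, {"e": "a"}, (1, 2))
h = Hypergraph({0}, {"d": (0, 0)}, {"d": "a"}, (0, 0))
phi_v, phi_e = {1: 0, 2: 0}, {"e": "d"}
```

The example morphism is strong because the external nodes (1, 2) are mapped pointwise to (0, 0); it is not an isomorphism, since it is not injective on nodes.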



Notation: We call a hypergraph discrete, if its edge set is
empty. By $\mathbf{m}$ we denote a discrete graph of arity $m \in \mathbb{N}$
with $m$ nodes where every node is external (see Figure (a),
external nodes are labelled (1), (2), ... in
their respective order).

The hypergraph $H = [l]_n$ contains exactly one edge $e$ with
label $l$ where $s_H(e) = \chi_H$, $ar(e) = n$ and³ $V_H = Set(\chi_H)$
(see (b), nodes are ordered from left to right).

[Figure: (a) the discrete graph $\mathbf{m}$ with external nodes (1), ..., (m); (b) the hypergraph $[l]_n$ with external nodes (1), ..., (n).]

² The application of $\phi_V$ to a string of nodes is defined pointwise.
³ $Set(s)$ is the set of all elements of a string $s$.
The next step is to define a method (first introduced in [10]) for the
annotation of hypergraphs with lattice elements and to describe how these annotations
change under morphisms. We use annotated hypergraphs as types, where the
annotations can be considered as extra typing information. Therefore we use the
terms annotated hypergraph and type graph as synonyms.

Definition 2. (Annotated Hypergraphs) Let $A$ be a mapping assigning a
lattice $A(H) = (I, \le)$ to every hypergraph and a function $A_\phi : A(H) \to A(H')$
to every morphism $\phi : H \to H'$. We assume that $A$ satisfies:

$A_\phi \circ A_\psi = A_{\phi \circ \psi}$    $A_{id_H} = id_{A(H)}$    $A_\phi(a \vee b) = A_\phi(a) \vee A_\phi(b)$    $A_\phi(\bot) = \bot$

where $\vee$ is the join-operation, $a$ and $b$ are two elements of the lattice $A(H)$ and
$\bot$ is its bottom element.

If $a \in A(H)$, then $H[a]$ is called an annotated hypergraph. And $\phi : H[a] \to_A
H'[a']$ is called an $A$-morphism if $\phi : H \to H'$ is a hypergraph morphism and
$A_\phi(a) \le a'$. Furthermore $H[a]$ and $H'[a']$ are called isomorphic if there is a
strong bijective $A$-morphism $\phi$ with $A_\phi(a) = a'$ between them.

Example: We consider the following annotation mapping $A$: let
$(\{false, true\}, \le)$ be the boolean lattice where $false < true$. We define $A(H)$
to be the set of all mappings from $V_H$ into $\{false, true\}$ (which yields a lattice
with pointwise order). By choosing an element of $A(H)$ we fix a subset of the
nodes. So let $a : V_H \to \{false, true\}$ be an element of $A(H)$ and let $\phi : H \to H'$,
$v' \in V_{H'}$. We define: $A_\phi(a) = a'$ where $a'(v') = \bigvee_{\phi(v) = v'} a(v)$. That is, if a node
$v$ with annotation true is mapped to a node $v'$ by $\phi$, the annotation of $v'$ will
also be true.
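
The boolean annotation mapping of this example can be written out in a few lines. The following Python sketch is illustrative only: the names `push_forward` and `join` are hypothetical, annotations are dictionaries from nodes to booleans, and only the node component of the morphism is needed.

```python
def push_forward(phi_v, a, target_nodes):
    """A_phi(a): v' is true iff some v with phi_v[v] == v' is annotated true."""
    return {w: any(a[v] for v in a if phi_v[v] == w) for w in target_nodes}

def join(a, b):
    """Pointwise join of two annotations over the same node set."""
    return {v: a[v] or b[v] for v in a}

phi_v = {1: "p", 2: "p", 3: "q"}          # nodes 1 and 2 are merged into p
target = {"p", "q", "r"}                   # r has no preimage
a = {1: False, 2: True, 3: False}
b = {1: False, 2: False, 3: True}
```

A node without preimage receives the empty join, i.e. false, so the bottom annotation is mapped to the bottom annotation. On this data one can also check that `push_forward` commutes with `join`, as Definition 2 requires.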

From the point of view of category theory, A is a functor from the category 
of hypergraphs and hypergraph morphisms into the category of lattices and 
join-morphisms (i.e. functions preserving the join operation of the lattice). 

We now introduce a method for attaching (annotated) hypergraphs with a 
construction plan consisting of discrete graph morphisms. 

Definition 3. (Hypergraph Construction) Let $H_1[a_1], \ldots, H_n[a_n]$ be
annotated hypergraphs and let $\zeta_i : \mathbf{m}_i \to D$, $1 \le i \le n$, be hypergraph morphisms
where $ar(H_i) = m_i$ and $D$ is discrete. Furthermore let $\phi_i : \mathbf{m}_i \twoheadrightarrow H_i$ be the
unique strong morphisms.

For this construction we assume that the node and edge sets of $H_1, \ldots, H_n$
and $D$ are pairwise disjoint. Furthermore let $\approx$ be the smallest equivalence on
their nodes satisfying $\zeta_i(v) \approx \phi_i(v)$ if $1 \le i \le n$, $v \in V_{\mathbf{m}_i}$. The nodes of the
constructed graph are the equivalence classes of $\approx$. We define

$\bigotimes_{i=1}^{n}(H_i, \zeta_i) = \bigl((V_D \cup \bigcup_{i=1}^{n} V_{H_i})/{\approx},\ \bigcup_{i=1}^{n} E_{H_i},\ s_H,\ l_H,\ \chi_H\bigr)$

where $s_H(e) = [v_1]_\approx \ldots [v_k]_\approx$ if $e \in E_{H_i}$ and $s_{H_i}(e) = v_1 \ldots v_k$. Furthermore
$l_H(e) = l_{H_i}(e)$ if $e \in E_{H_i}$. And we define $\chi_H = [v_1]_\approx \ldots [v_k]_\approx$ if $\chi_D = v_1 \ldots v_k$.

If $n = 0$, the result of the construction is $D$ itself.
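
The quotient in Definition 3 can be computed with a union-find structure. The sketch below is a hedged Python illustration without annotations: the function `construct`, the dictionary-based graph representation, and the encoding of each $\zeta_i$ as a tuple of nodes of $D$ are assumptions made for the example. Nodes are tagged with the index of their graph so that the node sets are disjoint.

```python
def construct(d_nodes, d_ext, parts):
    """parts: list of (H, zeta); H is a dict with keys nodes, conn, label, ext,
    and zeta[j] is the node of D that the j-th external node of H is glued to."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    def union(u, v):
        parent[find(u)] = find(v)

    for d in d_nodes:
        find(("D", d))
    for i, (h, zeta) in enumerate(parts):
        for v in h["nodes"]:
            find((i, v))
        # zeta_i(v) ~ phi_i(v): glue the j-th external node of H_i to zeta[j]
        for j, v in enumerate(h["ext"]):
            union((i, v), ("D", zeta[j]))

    nodes = {find(v) for v in list(parent)}   # one representative per class
    conn, label = {}, {}
    for i, (h, _) in enumerate(parts):
        for e, attached in h["conn"].items():
            conn[(i, e)] = tuple(find((i, v)) for v in attached)
            label[(i, e)] = h["label"][e]
    ext = tuple(find(("D", d)) for d in d_ext)
    return {"nodes": nodes, "conn": conn, "label": label, "ext": ext}

# Two binary edges a and b glued into a path x - y - z with external nodes x, z.
h1 = {"nodes": {0, 1}, "conn": {"e": (0, 1)}, "label": {"e": "a"}, "ext": (0, 1)}
h2 = {"nodes": {0, 1}, "conn": {"e": (0, 1)}, "label": {"e": "b"}, "ext": (0, 1)}
path = construct({"x", "y", "z"}, ("x", "z"), [(h1, ("x", "y")), (h2, ("y", "z"))])
```

The result has three nodes, the middle one shared by both edges. Which element represents an equivalence class is an implementation detail, so results should be compared up to isomorphism.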
We construct embeddings $\phi : D \to H$ and $\eta_i : H_i \to H$ by mapping every
node to its equivalence class and every edge to itself. Then the construction of
annotated graphs can be defined as follows:

$\bigotimes_{i=1}^{n}(H_i[a_i], \zeta_i) = H\bigl[\bigvee_{i=1}^{n} A_{\eta_i}(a_i)\bigr]$

In other words: we join all graphs $D, H_1, \ldots, H_n$ and fuse exactly the nodes
which are the image of one and the same node in the $\mathbf{m}_i$; $\chi_D$ becomes the new
sequence of external nodes. Lattice annotations are joined if the annotated nodes
are merged. In terms of category theory, $\bigotimes_{i=1}^{n}(H_i[a_i], \zeta_i)$ is the colimit of the
$\zeta_i$ and the $\phi_i$ regarded as $A$-morphisms ($D$ and the $\mathbf{m}_i$ are annotated with the
bottom element $\bot$). We do not mention this fact in the rest of the paper, but
it is used extensively in the proofs (for the proofs and several examples see the
full version [11]).

We also use another, more intuitive notation for graph
construction. Let $\zeta_i : \mathbf{m}_i \to D$, $1 \le i \le n$.

Then we depict $\bigotimes_{i=1}^{n}(H_i, \zeta_i)$ by drawing the hypergraph
$(V_D, \{e_1, \ldots, e_n\}, s_H, l_H, \chi_D)$ where $s_H(e_i) = \zeta_i(\chi_{\mathbf{m}_i})$ and
$l_H(e_i) = H_i$.

Example: we can draw $\bigotimes_{i=1}^{2}(H_i, \zeta_i)$ where $\zeta_1, \zeta_2 : \mathbf{n} \twoheadrightarrow \mathbf{n}$ as in the picture
above (note that the edges have dashed lines). Here we fuse the external nodes of
$H_1$ and $H_2$ in their respective order and denote the resulting graph by $H_1 \Box H_2$.
If there is an edge with a dashed line labelled with an edge $[l]_n$ we rather draw
it with a solid line and label it with $l$ (see e.g. the second figure in section 4.1).

Definition 4. (Hypergraph Rewriting) Let $\mathcal{R}$ be a set of pairs $(L,R)$ (called
rewriting rules), where the left-hand side $L$ and the right-hand side $R$ are both
hypergraphs of the same arity. Then $\to_{\mathcal{R}}$ is the smallest relation generated by
the pairs of $\mathcal{R}$ and closed under hypergraph construction.

In our approach we generate the same transition system as in the double- 
pushout approach to graph rewriting described in [2] (for details see [13]). 

We need one more concept: a linear mapping which is an inductively defined 
transformation, mapping hypergraphs to hypergraphs and adding annotation. 

Definition 5. (Linear Mapping) A function from hypergraphs to hypergraphs 
is called arity-preserving if it preserves arity and isomorphism classes of hyper- 
graphs. 

Let $t$ be an arity-preserving function that maps hypergraphs of the form $[l]_n$ to
annotated hypergraphs. Then $t$ can be extended to arbitrary hypergraphs by
defining $t(\bigotimes_{i=1}^{n}([l_i]_{n_i}, \zeta_i)) = \bigotimes_{i=1}^{n}(t([l_i]_{n_i}), \zeta_i)$ and is then called a linear mapping.

3 Static Analysis and Type Systems for Graph Rewriting 

Having introduced all underlying notions we now specify the requirements for
type systems. We assume that there is a fixed set $\mathcal{R}$ of rewrite rules, an
annotation mapping $A$, a predicate $X$ on hypergraphs (representing the property we
want to check) and a relation $\triangleright$ with the following meaning: if $H \triangleright T$ where $H$ is
a hypergraph and $T$ a type graph (annotated wrt. $A$), then $H$ has type $T$. It
is required that $H$ and $T$ have the same arity.

We demand that $\triangleright$ satisfies the following conditions: first, a type should
contain information concerning the properties of a hypergraph, i.e. if a hypergraph
has a type, then we can be sure that the property $X$ holds.

$H \triangleright T \Rightarrow X(H)$    (correctness) (1)

During reduction, the type stays invariant.

$H \triangleright T \wedge H \to_{\mathcal{R}} H' \Rightarrow H' \triangleright T$    (subject reduction property) (2)

From (1) and (2) we can conclude that $H \triangleright T$ and $H \to_{\mathcal{R}}^* H'$ imply $X(H')$, that
is $X$ holds during the entire reduction.

The strong $A$-morphisms introduced in Definition 2 impose a preorder on
type graphs. It should always be possible to weaken the type with respect to
that preorder.

$H \triangleright T \wedge T \twoheadrightarrow_A T' \Rightarrow H \triangleright T'$    (weakening) (3)

We also demand that the type system is compositional, i.e. a graph has a type if
and only if this type can be obtained by typing its subgraphs and combining these
types. We cannot sensibly demand that the type of an expression is obtained
by combining the types of the subgraphs in exactly the same way the expression
is constructed, so we introduce a partial arity-preserving mapping $f$ doing some
post-processing.

$\forall i: H_i \triangleright T_i \Rightarrow \bigotimes_{i=1}^{n}(H_i, \zeta_i) \triangleright f(\bigotimes_{i=1}^{n}(T_i, \zeta_i))$
$\bigotimes_{i=1}^{n}(H_i, \zeta_i) \triangleright T \Rightarrow \exists T_i: (H_i \triangleright T_i \text{ and } f(\bigotimes_{i=1}^{n}(T_i, \zeta_i)) \twoheadrightarrow_A T)$    (compositionality) (4)

A last condition, the existence of minimal types, may not be strictly needed
for type systems, but type systems satisfying this condition are much easier to
handle.

$H \text{ typable} \Rightarrow \exists T: (H \triangleright T \wedge (H \triangleright T' \iff T \twoheadrightarrow_A T'))$    (minimal types) (5)

Let us now assume that types are computed from graphs in the following
way: there is a linear mapping $t$, such that $H \triangleright f(t(H))$ if $f(t(H))$ is defined,
and all other types of $H$ are derived by the weakening rule, i.e. $f(t(H))$ is the
minimal type of $H$.

The meaning of the mappings $t$ and $f$ can be explained as follows: $t$ is a
transformation local to edges, abstracting from irrelevant details and adding
annotation information to a graph. The mapping $f$, on the other hand, is a global
operation, merging or removing parts of a graph in order to anticipate future
reductions and thus ensure the subject reduction property. In the example in
section 4.1 $f$ “folds” a graph into itself, hence the letter $f$. In order to obtain
compositionality, it is required that $f$ can be applied arbitrarily often at any stage
of type inference, without losing information (see condition (6) of Theorem 1).

In this setting it is sufficient to prove some simpler conditions; in particular, the
proof of (2) can be conducted locally.

Theorem 1. Let $A$ be a fixed annotation mapping, let $f$ be an arity-preserving
mapping as above, let $t$ be a linear mapping, let $X$ be a predicate on hypergraphs
and let $H \triangleright T$ if and only if $f(t(H)) \twoheadrightarrow_A T$. Let us further assume that $f$ satisfies⁴

$f(\bigotimes_{i=1}^{n}(T_i, \zeta_i)) = f(\bigotimes_{i=1}^{n}(f(T_i), \zeta_i))$    (6)        $T \twoheadrightarrow_A T' \Rightarrow f(T) \twoheadrightarrow_A f(T')$    (7)

Then the relation $\triangleright$ satisfies conditions (1)-(5) if and only if it satisfies

$f(t(H)) \text{ defined} \Rightarrow X(H)$    (8)        $(L, R) \in \mathcal{R} \Rightarrow f(t(R)) \twoheadrightarrow_A f(t(L))$    (9)

The operation $f$ can often be characterised by a universal property with the
intuitive notion that $f(T)$ is the “smallest” type graph (wrt. the preorder $\twoheadrightarrow_A$)
for which $T \twoheadrightarrow_A f(T)$ and a property $C$ hold.

Proposition 1. Let $C$ be a property on type graphs such that $f(T)$ can be
characterised in the following way: $f(T)$ satisfies $C$, there is a morphism $\phi : T \twoheadrightarrow_A
f(T)$ and for every other morphism $\phi' : T \to_A T'$ where $C(T')$ holds, there is a
unique morphism $\psi : f(T) \to_A T'$ such that $\psi \circ \phi = \phi'$. Furthermore we demand
that if there exists a morphism $\phi : T \to_A T'$ such that $C(T')$ holds, then $f(T)$ is
defined.

Then if $f(T)$ is defined, it is unique up to isomorphism. Furthermore $f$
satisfies conditions (6) and (7).

4 Case Studies 

4.1 A Type System for the Polyadic π-Calculus

We present a graph rewriting semantics for the asynchronous polyadic π-calculus
[17] without choice and matching, already introduced in [12]. Different ways of
encoding the π-calculus into graph rewriting can be found in [21,5,4].

We apply the theory presented in section 3, introduce a type system avoiding
runtime errors produced by mismatching arities and show that it satisfies the
conditions of Theorem 1. Afterwards we show that a graph has a type if and
only if the corresponding π-calculus process has a type in a standard type system
with infinite regular trees.

Definition 6. (Process Graphs) A process graph $P$ is inductively defined as
follows: $P$ is a hypergraph with a duplicate-free string of external nodes.
Furthermore each edge $e$ is either labelled with $(k,n)Q$ where $Q$ is again a process
graph, $1 \le n \le ar(Q)$ and $1 \le k \le ar(e) = ar(Q) - n$ ($e$ is a process waiting for
a message with $n$ ports arriving at its $k$-th node), with $!Q$ where $ar(Q) = ar(e)$
($e$ is a process which can replicate itself) or with the constant $M$ ($e$ is a message
sent to its last node).

⁴ In an equation $T = T'$ we assume that $T$ is defined if and only if $T'$ is defined. And
in a condition of the form $T \twoheadrightarrow_A T'$ we assume that $T$ is defined if $T'$ is defined.

The reduction relation is generated by the rules in (A) (replication) and by
rule (B) (reception of a message by a process) and is closed under isomorphism
and graph construction.

[Figure: rule (A), replication: $[!Q] \to Q \Box [!Q]$ on external nodes (1), ..., (m); rule (B), reception: $[(k,n)Q]$ together with $[M]$ rewrites to $Q$ on external nodes (1), ..., (k), ..., (m), (m+1), ..., (m+r), applicable if $n = r$.]

A process graph may contain a bad redex, if it contains a subgraph
corresponding to the left-hand side of rule (B) with $n \ne r$, so we define the predicate
$X$ as follows: $X(P)$ if and only if $P$ does not contain a bad redex.



We now propose a type system for process graphs by defining the mappings $t$ and $f$. (Note that in this case, the type graphs are trivially annotated by $\bot$, and so we omit the annotation mapping.)

The linear mapping $t$ is defined on the hyperedges as follows: $t([M]_n) = [O]_n$ ($O$ is a new edge label), $t([!Q]_m) = t(Q)$ and $t([(k,n)Q]_m)$ is defined as in the image to the right (in the notation explained after Definition 3). It is only defined if $n + m = ar(Q)$.

The mapping $f$ is defined as in Proposition 1 where $C$ is defined as follows:

$$C(T) \iff \forall e_1, e_2 \in E_T : \left(\lfloor s_T(e_1) \rfloor_{ar(e_1)} = \lfloor s_T(e_2) \rfloor_{ar(e_2)} \Rightarrow e_1 = e_2\right)$$

The linear mapping $t$ extracts the communication structure from a process graph, i.e. an edge of the form $[O]_n$ indicates that its nodes (except the last) might be sent or received via its last node. Then $f$ makes sure that the arity of the arriving message matches the expected arity and that nodes that might get fused during reduction are already fused in $f(t(P))$.

Proposition 2. The trivial annotation mapping $A$ (where every lattice consists of a single element $\bot$), the mappings $f$ and $t$ and the predicate $X$ defined above satisfy conditions (6)-(9) of Theorem 1. Thus if $P \triangleright T$, then $P$ will never produce a bad redex during reduction.

We now compare our type system to a standard type system of the π-calculus. An encoding of process graphs into the asynchronous π-calculus can be defined as follows.

Definition 7. (Encoding) Let $P$ be a process graph, let $\mathcal{N}$ be the name set of the π-calculus and let $\tilde t \in \mathcal{N}^*$ such that $|\tilde t| = ar(P)$. We define $\Theta_{\tilde t}(P)$ inductively as follows:

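Footnote: $\lfloor s \rfloor_i$ extracts the $i$-th element of a string $s$.

[Figure: definition of $t([(k,n)Q]_m)$.]

$$\Theta_{a_1 \ldots a_{n+1}}([M]_{n+1}) = \overline{a_{n+1}}\langle a_1, \ldots, a_n \rangle \qquad \Theta_{\tilde t}([!Q]_m) = \,!\,\Theta_{\tilde t}(Q)$$

$$\Theta_{a_1 \ldots a_m}([(k,n)Q]_m) = a_k(x_1, \ldots, x_n).\Theta_{a_1 \ldots a_m x_1 \ldots x_n}(Q)$$

and the encoding under $\tilde t$ of a graph constructed from $P_1, \ldots, P_n$ via $\zeta_1, \ldots, \zeta_n$ is

$$(\nu\, \mu(V_D \setminus Set(\chi_D)))\left(\Theta_{\mu(\zeta_1(\chi_{m_1}))}(P_1) \mid \ldots \mid \Theta_{\mu(\zeta_n(\chi_{m_n}))}(P_n)\right)$$

where $\zeta_i : m_i \to D$, $1 \le i \le n$ and $\mu : V_D \to \mathcal{N}$ is a mapping such that $\mu$ restricted to $V_D \setminus Set(\chi_D)$ is injective, $\mu(V_D \setminus Set(\chi_D)) \cap \mu(Set(\chi_D)) = \emptyset$ and $\mu(\chi_D) = \tilde t$. Furthermore the $x_1, \ldots, x_n \in \mathcal{N}$ are fresh names.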

The encoding of a discrete graph is included in the last case, if we set n = 0 
and assume that the empty parallel composition yields the nil process 0. 

An operational correspondence can be stated as follows: 

Proposition 3. Let $p$ be an arbitrary expression in the asynchronous polyadic π-calculus without summation. Then there exists a process graph $P$ and a duplicate-free string $\tilde t \in \mathcal{N}^*$ such that $\Theta_{\tilde t}(P) = p$. Furthermore for process graphs $P$, $P'$ and for every duplicate-free string $\tilde t \in \mathcal{N}^*$ with $|\tilde t| = ar(P) = ar(P')$ it is true that:

— $P = P'$ implies $\Theta_{\tilde t}(P) = \Theta_{\tilde t}(P')$

— $P \to P'$ implies $\Theta_{\tilde t}(P) \to \Theta_{\tilde t}(P')$

— $\Theta_{\tilde t}(P) \to p$ with $p \ne wrong$ implies that $P \to Q$ and $\Theta_{\tilde t}(Q) = p$ for some process graph $Q$.

— $\Theta_{\tilde t}(P) \to wrong$ if and only if $P \to^* P'$ for some process graph $P'$ containing a bad redex.

We now compare our type system with a standard type system of the π-calculus: a type tree is a potentially infinite ordered tree with only finitely many non-isomorphic subtrees. A type tree is represented by the tuple $[t_1, \ldots, t_n]$ where $t_1, \ldots, t_n$ are again type trees, the children of the root. A type assignment $\Gamma = x_1 : t_1, \ldots, x_n : t_n$ assigns names to type trees where $\Gamma(x_i) = t_i$. The rules of the type system are simplified versions of the ones from [19], obtained by removing the subtyping annotations.

$$\frac{\Gamma \vdash p \quad \Gamma \vdash q}{\Gamma \vdash p \mid q} \qquad \frac{\Gamma \vdash p}{\Gamma \vdash\, !p} \qquad \frac{\Gamma, a : t \vdash p}{\Gamma \vdash (\nu a)p}$$

$$\frac{\Gamma(a) = [t_1, \ldots, t_m] \quad \Gamma, x_1 : t_1, \ldots, x_m : t_m \vdash p}{\Gamma \vdash a(x_1, \ldots, x_m).p} \qquad \frac{\Gamma(a) = [\Gamma(a_1), \ldots, \Gamma(a_m)]}{\Gamma \vdash \bar a\langle a_1, \ldots, a_m \rangle}$$

We will now show that if a process graph has a type, then its encoding has a type in the π-calculus type system and vice versa. In order to express this we first describe the unfolding of a type graph into type trees.

Proposition 4. Let $T$ be a type graph and let $\sigma$ be a mapping from $V_T$ into the set of type trees. The mapping $\sigma$ is called consistent, if it satisfies for every edge $e \in E_T$: $s_T(e) = v_1 \ldots v_n v \Rightarrow \sigma(v) = [\sigma(v_1), \ldots, \sigma(v_n)]$. Every type graph of the form $f(t(P))$ has such a consistent mapping.

Let $P \triangleright T$ with $n = ar(T)$ and let $\sigma$ be a consistent mapping for $T$. Then it holds for every duplicate-free string $\tilde t$ of length $n$ that $\lfloor \tilde t \rfloor_1 : \sigma(\lfloor \chi_T \rfloor_1), \ldots, \lfloor \tilde t \rfloor_n : \sigma(\lfloor \chi_T \rfloor_n) \vdash \Theta_{\tilde t}(P)$.

Now let $\Gamma \vdash \Theta_{\tilde t}(P)$. Then there exists a type graph $T$ such that $P \triangleright T$ and a consistent mapping $\sigma$ such that for every $1 \le i \le |\tilde t|$ it holds that $\sigma(\lfloor \chi_T \rfloor_i) = \Gamma(\lfloor \tilde t \rfloor_i)$.

4.2 Concurrent Object-Oriented Programming 

We now show how to model a concurrent object-oriented system by graph rewrit- 
ing and then present a type system. In our model, several objects may compete 
in order to receive a message, and several messages might be waiting at the 
same object. Typically, type systems in object-oriented programming are there 
to ensure that an object that receives a message is able to process it. 

Definition 8. (Concurrent object-oriented rewrite system) Let $(\mathcal{C}, <:)$ be a lattice of classes with a top class $\top$ and a bottom class $\bot$. We denote classes by the letters $A, B, C, \ldots$. Furthermore let $\mathcal{M}$ be a set of method names. The function $ar : \mathcal{C} \cup \mathcal{M} \to \mathbb{N} \setminus \{0\}$ assigns an arity to every class or method name.

An object graph $G$ is a hypergraph with a duplicate-free string of external nodes, labelled with elements of $\mathcal{C} \setminus \{\bot\} \cup \mathcal{M}$ where for every edge $e$ it holds that $ar(e) = ar(l_G(e))$. A concurrent object-oriented rewrite system (specifying the semantics) consists of a set of rules $\mathcal{R}$ satisfying the following conditions:

— The left-hand side of a rule always has the form shown in Figure (C) below (where $A \in \mathcal{C} \setminus \{\bot\}$, $ar(A) = n$, $m \in \mathcal{M}$, $ar(m) = k + 1$). The right-hand side is again an object graph of arity $n + k$. If a left-hand side $R_{A,m}$ exists, we say that $A$ understands $m$.

[Figure (C): the left-hand side $R_{A,m}$, with external nodes (1), ..., (n), (n+1), ..., (n+k).]

— If $A <: B$, $A \ne \bot$ and $B$ understands $m$, then $A$ also understands $m$.

— For all $m \in \mathcal{M}$, either $\{A \mid A \text{ understands } m\}$ is empty or it contains a greatest element.

An object graph $G$ contains a “message not understood”-error if $G$ contains a subgraph $R_{A,m}$, but $A$ does not understand $m$.

Thus the predicate X for this section is defined as follows: X{G) if and only 
if G does not contain a “message not understood” -error. 

In contrast to the previous section, we now use annotated type graphs: the annotation mapping $A$ assigns a lattice $(\{a : V_H \to \mathcal{C} \times \mathcal{C}\}, \le)$ to every hypergraph $H$. The partial order is defined as follows: $a_1 \le a_2 \iff \forall v : (a_1(v) = (A_1, B_1) \wedge a_2(v) = (A_2, B_2) \Rightarrow A_1 <: A_2 \wedge B_1 :> B_2)$, i.e. we have covariance in the first and contravariance in the second position. If a node $v$ is labelled $(A, B)$, this has the following intuitive meaning: we can accept at least as many messages as an object of class $A$ on this node and we can send at most as many messages as an object of class $B$ can accept.
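
The order on annotations can be illustrated with a small sketch. It is not part of the formal development: the chain-shaped class lattice `CLASSES` and all function names are assumptions made purely for illustration.

```python
# Hypothetical class lattice, a simple chain: Bot <: Student <: Person <: Top.
CLASSES = ["Bot", "Student", "Person", "Top"]  # ordered from bottom to top


def is_subclass(a, b):
    """a <: b in the illustrative chain lattice."""
    return CLASSES.index(a) <= CLASSES.index(b)


def pair_leq(p1, p2):
    """(A1, B1) <= (A2, B2) iff A1 <: A2 (covariant) and B1 :> B2 (contravariant)."""
    (a1, b1), (a2, b2) = p1, p2
    return is_subclass(a1, a2) and is_subclass(b2, b1)


def annotation_leq(ann1, ann2):
    """a1 <= a2 iff the pairwise condition holds at every node v."""
    return ann1.keys() == ann2.keys() and all(
        pair_leq(ann1[v], ann2[v]) for v in ann1
    )
```

In this sketch an annotation is a dictionary from node names to pairs of classes, so `annotation_leq` compares two annotations of the same hypergraph node by node.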

Footnote: The top class $\top$ corresponds to the class Object in Java.

Furthermore we define $A_\phi(a)(v') = \bigvee_{\phi(v) = v'} a(v)$ where $\phi : H \to H'$, $a$ is an element of $A(H)$ and $v' \in V_{H'}$.

We now define the operator $f$: let $T[a]$ be a type graph of arity $n$ where it holds for all nodes $v$ that $a(v) = (A, B)$ implies $A <: B$ (otherwise $f$ is undefined). Then $f$ reduces the graph to its string of external nodes, i.e. $f(T[a]) = \mathbf{n}[b]$ where $b(\lfloor \chi_{\mathbf{n}} \rfloor_i) = a(\lfloor \chi_T \rfloor_i)$.

The linear mapping $t$ determines the type of a class or method. It is necessary to choose a linear mapping that preserves the interface of left-hand and right-hand sides, i.e. we can use any $t$ that satisfies condition (9) and the following two conditions for $A \in \mathcal{C} \setminus \{\bot\}$ and $m \in \mathcal{M}$:

$$t([A]_n) = [A]_n[a] \text{ where } a(\lfloor \chi_{[A]_n} \rfloor_1) \ge (A, \top)$$

$$t([m]_n) = [m]_n[a] \text{ where } a(\lfloor \chi_{[m]_n} \rfloor_n) \ge (\top, \max\{B \mid B \text{ understands } m\})$$

Proposition 5. The annotation mapping $A$, the mappings $f$ and $t$ and the predicate $X$ defined above satisfy conditions (6)-(9) of Theorem 1. Thus if $G \triangleright T$, then $G$ will never produce a “message not understood”-error during reduction.

In this case we do not prove that this type system corresponds to an object-oriented type system, but rather present a semi-formal argument: we give the syntax and a type system for a small object calculus, and furthermore an encoding into hypergraphs, without really defining the semantics. For the formal semantics of object calculi see [20,9], among others.

An expression $e$ in the object calculus either has the form $\mathtt{new}\ A(e_1, \ldots, e_n)$ where $A \in \mathcal{C} \setminus \{\bot\}$ and $ar(A) = n + 1$ or $e.m(e_1, \ldots, e_n)$ where $m \in \mathcal{M}$ and $ar(m) = n + 2$. The $e_i$ are again expressions. Every class $A$ is assigned an $(ar(A) - 1)$-tuple of classes defining the type of the fields of $A$ ($A : (A_1, \ldots, A_n)$) and every method $m$ with $ar(m) = n + 2$ defined in class $B$ is assigned a type $B.m : C_1, \ldots, C_n \to C$. If a method is overridden in a subclass it is required to have the same type. A simple type system looks as follows:

$$\frac{e : A \quad A <: B}{e : B} \qquad \frac{A : (A_1, \ldots, A_n) \quad e_i : A_i}{\mathtt{new}\ A(e_1, \ldots, e_n) : A} \qquad \frac{e : B \quad B.m : C_1, \ldots, C_n \to C \quad e_i : C_i}{e.m(e_1, \ldots, e_n) : C}$$

Now an encoding $[\![\cdot]\!]$ can be defined as shown in the figure to the right. We introduce the convention that the penultimate node of a message can be used to access the result after the rewriting step.

If $A : (A_1, \ldots, A_n)$ we define $t$ in such a way that the $n + 1$ external nodes of $t([A]_{n+1})$ are annotated by $(A, \top), (\top, A_1), \ldots, (\top, A_n)$. And if $B.m : C_1, \ldots, C_n \to C$ (where $B$ is the maximal class which understands method $m$), we annotate the external nodes of $t([m]_{n+2})$ by $(\top, C_1), \ldots, (\top, C_n), (C, \top), (\top, B)$. Now we can show by induction on the typing rules that if $e : A$, then there exists a type graph $T[a]$ such that $[\![e]\!] \triangleright T[a]$ and $a(\lfloor \chi_T \rfloor_1) = (A, \top)$.
5 Conclusion and Comparison to Related Work

This is a first tentative approach aimed at developing a general framework for the static analysis of graph rewriting in the context of type systems. It is obvious that there are many type systems which do not fit well into our proposal. But since we are able to capture the essence of two important type systems, we believe we are on the right track.

Types are often used to make the connection of components and the flow of information through a system explicit (see e.g. the type system for the π-calculus, where the type trees indicate which tuple of channels is sent via which channel). Since connections are already explicit in graphs, we can use them both
as type and as the expression to be typed. Via morphisms we can establish a 
clear connection between an expression and its type. Graphs are furthermore 
useful since we can easily add an extra layer of annotation. 

Work that is very close in spirit to ours is [8] by Honda which also presents a 
general framework for type systems. The underlying model is closer to standard 
process algebras and the main focus is on the characterisation and classification 
of type systems. 

The idea of composing graphs in such a way that they satisfy a certain 
property was already presented by Lafont in [14] where it is used to obtain 
deadlock-free nets. 

In graph rewriting there already exists a concept of typed graphs [1] , related 
to ours, but nevertheless different. In that work, a type graph is fixed a priori and 
there is only one type graph for every set of productions. Graphs are considered 
valid only if they can be mapped into the type graph by a graph morphism (this 
is similar to our proposal) . In our case, we compute the type graphs a posteriori 
and it is a crucial point in the design of every type system to distinguish as many 
graphs as possible by assigning different type graphs to them. 

This paper is a continuation of the work presented in [10] where the idea of generic type systems for process graphs (as defined in section 4.1) was introduced, but no proof of the equivalence of our type system to the standard type system for the π-calculus was given. The ideas presented there are now extended to general graph rewriting systems.

Further work will consist in better understanding the underlying mechanism of the type system. An interesting question in this context is the following: given a set of rewrite rules, is it possible to automatically derive mappings $f$ and $t$ satisfying the conditions of Theorem 1?
Abstract. The definition of open bisimilarity on the χ-processes does not give rise to a sensible relation on the χ-processes with the mismatch operator. The paper proposes ground open congruence as a principal open congruence on the χ-processes with the mismatch operator. The algebraic properties of the ground congruence are studied. The paper also takes a close look at barbed congruence. This relation is similar to the ground congruence. The precise relationship between the two is worked out. It is pointed out that the sound and complete system for the ground congruence can be obtained by removing one tau law from the complete system for the barbed congruence.



1 Introduction and χ-Calculus with Mismatch

The π-calculus ([6]) is a powerful process calculus. The expressiveness is partly supported by input processes of the form $a(x).P$ and output processes of the form $\bar a x.P$. The former may receive a name at channel name $a$ before evolving as $P$ with $x$ replaced by the received name. The latter can emit $x$ at $a$ and then continues as $P$. The expressiveness is also supported by processes of the form $(x)P$. The localization operator $(x)$ encapsulates the name $x$ in $P$. In the χ-calculus ([1,2,3,4]) the input and output processes are unified as $\alpha[x].P$, in which $\alpha$ stands for either a name or a coname.

Formally χ-processes are defined by the following abstract syntax:

$$P := 0 \mid \alpha[x].P \mid P|P \mid (x)P \mid [x{=}y]P \mid P+P$$

where $\alpha \in \mathcal{N} \cup \bar{\mathcal{N}}$. Here $\mathcal{N}$ is the set of names ranged over by lower case letters. The set $\{\bar x \mid x \in \mathcal{N}\}$ of conames is denoted by $\bar{\mathcal{N}}$. The name $x$ in $(x)P$ is local. A name is global in $P$ if it is not local in $P$. The global names, the local names and the names of a syntactical object, as well as the notations $gn(\_)$, $ln(\_)$ and $n(\_)$, are defined with their standard meanings. We adopt the α-convention widely used in the literature on process algebra. We do not consider the replication or recursion operator since it does not affect the results of this paper.
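
As an illustration of the abstract syntax, the following sketch represents χ-processes as a small Python data type and computes the global names $gn(P)$. The class and field names (`Prefix`, `Loc`, `co`, and so on) are assumptions introduced here for illustration only; the sketch relies on the stated fact that the only binder is the localization operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nil:            # the process 0
    pass


@dataclass(frozen=True)
class Prefix:         # alpha[x].P; `co` is True when alpha is a coname
    chan: str
    co: bool
    arg: str
    cont: object


@dataclass(frozen=True)
class Par:            # P | Q
    left: object
    right: object


@dataclass(frozen=True)
class Sum:            # P + Q
    left: object
    right: object


@dataclass(frozen=True)
class Loc:            # (x)P, the only binder
    name: str
    body: object


@dataclass(frozen=True)
class Match:          # [x=y]P
    x: str
    y: str
    body: object


def gn(p):
    """Global names of a process: all names not bound by a localization."""
    if isinstance(p, Nil):
        return set()
    if isinstance(p, Prefix):
        return {p.chan, p.arg} | gn(p.cont)
    if isinstance(p, (Par, Sum)):
        return gn(p.left) | gn(p.right)
    if isinstance(p, Loc):
        return gn(p.body) - {p.name}
    if isinstance(p, Match):
        return {p.x, p.y} | gn(p.body)
    raise TypeError(f"not a process: {p!r}")
```

For example, in $(x)(a[x].0 \mid \bar a[y].0)$ the name $x$ is local, so the global names are $a$ and $y$.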

* The author is funded by NNSFC (69873032) and 863 Hi-Tech Project (863-306- 
ZT06-02-2). He is also supported by BASICS, Center of Basic Studies in Computing 
Science, sponsored by Shanghai Education Committee. BASICS is affiliated to the 
Department of Computer Science at Shanghai Jiaotong University. 
The following labeled transition system defines the operational semantics of the χ-calculus, in which symmetric rules are systematically omitted. In the following rules the letter $\gamma$ ranges over the set $\{\alpha(x), \alpha[x] \mid \alpha \in \mathcal{N} \cup \bar{\mathcal{N}}, x \in \mathcal{N}\} \cup \{\tau\}$ and the letter $\lambda$ over the set $\{\alpha(x), \alpha[x], [y/x] \mid \alpha \in \mathcal{N} \cup \bar{\mathcal{N}}, x, y \in \mathcal{N}\} \cup \{\tau\}$. The symbols $\alpha(x)$, $\alpha[x]$, $[y/x]$ represent restricted action, free action and update action respectively. The $x$ in the label $\alpha(x)$ is local.



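Sequentialization

$$\frac{}{\alpha[x].P \xrightarrow{\alpha[x]} P}\ \mathit{Sqn}$$

Composition

$$\frac{P \xrightarrow{\gamma} P' \quad ln(\gamma) \cap gn(Q) = \emptyset}{P|Q \xrightarrow{\gamma} P'|Q}\ \mathit{Cmp}_0 \qquad \frac{P \xrightarrow{[y/x]} P'}{P|Q \xrightarrow{[y/x]} P'|Q[y/x]}\ \mathit{Cmp}_1$$

Communication

$$\frac{P \xrightarrow{\alpha(x)} P' \quad Q \xrightarrow{\bar\alpha[y]} Q'}{P|Q \xrightarrow{\tau} P'[y/x]|Q'}\ \mathit{Cmm}_0 \qquad \frac{P \xrightarrow{\alpha(x)} P' \quad Q \xrightarrow{\bar\alpha(x)} Q'}{P|Q \xrightarrow{\tau} (x)(P'|Q')}\ \mathit{Cmm}_1$$

$$\frac{P \xrightarrow{\alpha[x]} P' \quad Q \xrightarrow{\bar\alpha[y]} Q' \quad x \ne y}{P|Q \xrightarrow{[y/x]} P'[y/x]|Q'[y/x]}\ \mathit{Cmm}_2 \qquad \frac{P \xrightarrow{\alpha[x]} P' \quad Q \xrightarrow{\bar\alpha[x]} Q'}{P|Q \xrightarrow{\tau} P'|Q'}\ \mathit{Cmm}_3$$

Localization

$$\frac{P \xrightarrow{\lambda} P' \quad x \notin n(\lambda)}{(x)P \xrightarrow{\lambda} (x)P'}\ \mathit{Loc}_0 \qquad \frac{P \xrightarrow{\alpha[x]} P' \quad x \notin \{\alpha, \bar\alpha\}}{(x)P \xrightarrow{\alpha(x)} P'}\ \mathit{Loc}_1 \qquad \frac{P \xrightarrow{[y/x]} P'}{(x)P \xrightarrow{\tau} P'}\ \mathit{Loc}_2$$

Condition

$$\frac{P \xrightarrow{\lambda} P'}{[x{=}x]P \xrightarrow{\lambda} P'}\ \mathit{Mtch}$$

Summation

$$\frac{P \xrightarrow{\lambda} P'}{P+Q \xrightarrow{\lambda} P'}\ \mathit{Sum}$$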

A substitution is a function from $\mathcal{N}$ to $\mathcal{N}$ that is identical on all but a finite number of names. Substitutions are usually denoted by $\sigma, \sigma', \ldots$. The notations

and are used in their standard meanings. 
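
Since a substitution differs from the identity on only finitely many names, it can be represented by a finite map. The following is a minimal sketch under that assumption; the function names and the argument order of `compose` are hypothetical and not taken from the paper.

```python
def apply_subst(sigma, name):
    """Apply a substitution given as a finite dict; identity elsewhere."""
    return sigma.get(name, name)


def compose(s2, s1):
    """Return the substitution 'first s1, then s2' as a finite dict."""
    out = {}
    for x in set(s1) | set(s2):
        y = apply_subst(s2, apply_subst(s1, x))
        if y != x:  # keep only the names that are actually changed
            out[x] = y
    return out
```

With this representation, a substitution `s` is idempotent exactly when `compose(s, s) == s`.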

We will use two induced prefix operators, tau and update prefixes, defined as follows: $[y|x].P \stackrel{\mathrm{def}}{=} (a)(\bar a[y]|a[x].P)$ and $\tau.P \stackrel{\mathrm{def}}{=} (b)[b|b].P$ where $a, b$ are fresh.

The subject language of this paper is the $\chi^{\ne}$-calculus, the χ-calculus with the mismatch operator. The operational semantics of the mismatch combinator is defined as follows:



$$\frac{P \xrightarrow{\lambda} P' \quad x \ne y}{[x{\ne}y]P \xrightarrow{\lambda} P'}\ \mathit{Mismtch}$$




The Ground Congruence for Chi Calculus 387 



The set of $\chi^{\ne}$-processes is denoted by $\mathcal{C}$. Suppose $F$ is a finite set $\{y_1, \ldots, y_n\}$ of names. The notation $[y{\notin}F]P$ will stand for $[y{\ne}y_1] \cdots [y{\ne}y_n]P$, where the order of mismatch operators is immaterial. We will write $\phi$ and $\psi$, called conditions, to stand for sequences of match and mismatch combinators concatenated one after another, $\mu$ for a sequence of match operators, and $\delta$ for a sequence of mismatch operators. Consequently we write $\psi P$, $\mu P$ and $\delta P$. When the length of $\psi$ ($\mu$, $\delta$) is zero, $\psi P$ ($\mu P$, $\delta P$) is just $P$. The notation $\phi \Rightarrow \psi$ says that $\phi$ logically implies $\psi$ and $\phi \Leftrightarrow \psi$ that $\phi$ and $\psi$ are logically equivalent. A substitution $\sigma$ agrees with $\psi$, and $\psi$ agrees with $\sigma$, when $\psi \Rightarrow x{=}y$ if and only if $\sigma(x) = \sigma(y)$.
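
The agreement condition can be checked mechanically over a finite set of names. The sketch below is illustrative only and rests on two assumptions that are not in the text: the condition is satisfiable (so that it implies $x{=}y$ exactly when $x$ and $y$ are connected by its match operators), and only pairs drawn from the supplied set `names` are tested. All identifiers are hypothetical.

```python
from itertools import combinations


def make_find(matches, names):
    """Union-find over `names`: x and y share a root iff the matches force x = y."""
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for x, y in matches:
        parent[find(x)] = find(y)
    return find


def agrees(sigma, matches, names):
    """sigma agrees with the condition iff, for all x, y in `names`,
    (condition implies x = y)  <=>  sigma(x) == sigma(y).
    Assumes the condition is satisfiable; `names` must contain every
    name occurring in `matches`."""
    find = make_find(matches, names)
    return all(
        (find(x) == find(y)) == (sigma.get(x, x) == sigma.get(y, y))
        for x, y in combinations(sorted(names), 2)
    )
```

Here `matches` lists the pairs of the match operators of the condition and `sigma` is a finite dictionary that is the identity outside its keys.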

Bisimulation equivalence relations on mobile processes are a lot more complex 
than those on CCS processes. The complication is mainly due to the dynamic 
aspect of mobile processes. The names in a process are subject to updates during 
the evolution of the process. These updates could be caused either by actions 
in which the process participates or by changes incurred by environments. A sensible observational equivalence for mobile processes must take that into account. To illustrate what kind of relations one would obtain if one ignored the mobility, we introduce the following definition for the χ-calculus:

Definition 1. Let $\mathcal{R}$ be a symmetric binary relation on the set of χ-processes. It is called a naked bisimulation if whenever $P \mathcal{R} Q$ and $P \xrightarrow{\lambda} P'$ then some $Q'$ exists such that $Q \stackrel{\hat\lambda}{\Longrightarrow} Q' \mathcal{R} P'$. The naked bisimilarity $\approx$ is the largest naked bisimulation.

It is obvious that the definition of $\approx$ is simply a reiteration of the weak bisimilarity of CCS in terms of the operational semantics of the χ-calculus. However the naked bisimilarity is not a good equivalence relation since it is not closed under the parallel composition. For instance one has $a[x]|\bar b[y] \approx a[x].\bar b[y] + \bar b[y].a[x]$ but not $(a[x]|\bar b[y])|(c[a]|\bar c[b]) \approx (a[x].\bar b[y] + \bar b[y].a[x])|(c[a]|\bar c[b])$. Process equivalence is observational equivalence. One of the defining properties for an observational equivalence is that the equivalence should be closed under parallel composition. In [1,2,3,4], it has been argued that bisimulation equivalences for the χ-calculus are closed under substitution. This suggests introducing the following definition:

Let $\mathcal{R}$ be a symmetric binary relation on the set of χ-processes that is closed under substitution. It is called an open bisimulation if whenever $P \mathcal{R} Q$ and $P \xrightarrow{\lambda} P'$ then some $Q'$ exists such that $Q \stackrel{\hat\lambda}{\Longrightarrow} Q' \mathcal{R} P'$. The open bisimilarity $\approx_o$ is the largest open bisimulation.

The open bisimilarity $\approx_o$ has been studied in [1,2,3,4] in both the symmetric and the asymmetric frameworks. It must be pointed out that the investigations carried out in [1,2,3,4] are for the χ-calculus without the mismatch combinator. For the χ-calculus with the mismatch operator, one should ask the question whether the open bisimilarity $\approx_o$ is a sensible equivalence. In [5] the present authors have given a negative answer to the question. As it turned out the open bisimilarity defined above is not closed under parallel composition in the $\chi^{\ne}$-calculus! One has $[x{\ne}y]a[x].P + a[x].[x{\ne}y]\tau.P \approx_o a[x].[x{\ne}y]\tau.P$ but it is clear that $\bar a[y]|([x{\ne}y]a[x].P + a[x].[x{\ne}y]\tau.P) \not\approx_o \bar a[y]|(a[x].[x{\ne}y]\tau.P)$. This is a serious problem because closure under parallel composition is an intrinsic property of observational equivalence. In [5] we have studied the problem and introduced two modified open congruences. These are early open congruence and late open congruence. Their relationship strongly recalls that between the weak early equivalence and the weak late equivalence ([6]). It should be said however that both the early open congruence and the late open congruence are the obvious modifications with motivation from the π-calculus. They are not the open congruence for the χ-calculus with the mismatch operator. What is then the principal open congruence for the χ-calculus with the mismatch combinator? We will give our answer to the question in this paper. The way to arrive at the definition of the open congruence is via a particular naked bisimulation. In order to define this relation we need the notion of contexts defined as follows: (i) $[]$ is a context; (ii) if $C[]$ is a context then $\alpha[x].C[]$, $C[]|P$, $P|C[]$, and $[x{=}y]C[]$ are contexts.

Definition 2. The ground bisimilarity $\approx_g$ is the largest naked bisimulation that is closed under context.

In the above definition the requirement of closure under the prefix operator is reasonable since it is equivalent to that of closure under substitution. We will give an equivalent characterization of $\approx_g$ in the style of open semantics, which we argue is the principal open bisimilarity.

As it turns out the equivalence $\approx_g$ is very similar to the barbed bisimilarity of the χ-calculus with the mismatch operator. The difference is very subtle. The barbed bisimilarity also has an equivalent open characterization. The similarity and the difference between the ground bisimilarity and the barbed bisimilarity are revealed through their open characterizations.

This paper continues the work of [5] by studying the ground congruence and the barbed congruence for the $\chi^{\ne}$-calculus. The main contributions of this paper are as follows:

— We give an alternative characterization of the weak barbed bisimilarity. This 
characterization points out the complex nature of the weak barbed bisimi- 
larity. Many unknown equalities are discovered. A complete system for the 
weak barbed congruence is provided. The new tau laws used to establish the 
completeness result are surprisingly complex. 

— We study what we call ground open bisimilarity. A complete system for the 
ground open congruence is given. The relationship between the ground open 
congruence and the weak barbed congruence is revealed. 

Due to space limitation, all proofs have been omitted. 

2 Barbed Congruence 

The barbed equivalence is often quoted as a universal equivalence relation for process algebras. For a specific process calculus barbed equivalence immediately gives rise to an observational equivalence. For two process calculi barbed equivalence can be used to compare the semantics of the two models. Despite the universal nature, barbed equivalence can have quite different displays in different process calculi. The barbed equivalence for the χ-calculus has brought some new insight into the calculi of mobile processes. In this section we demonstrate that the barbed equivalence for the $\chi^{\ne}$-calculus is even more different. A characterization theorem for the barbed bisimilarity on the $\chi^{\ne}$-calculus is given. Some illustrating pairs of barbed equivalent processes are given. First we introduce the notion of barbedness.

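Definition 3. A process $P$ is strongly barbed at $a$, notation $P{\downarrow}a$, if $P \xrightarrow{\alpha(x)} P'$ or $P \xrightarrow{\alpha[x]} P'$ for some $P'$ such that $a \in \{\alpha, \bar\alpha\}$. $P$ is barbed at $a$, written $P{\Downarrow}a$, if some $P'$ exists such that $P \Longrightarrow P'{\downarrow}a$. A binary relation $\mathcal{R}$ is barbed if $\forall a \in \mathcal{N}.\ P{\Downarrow}a \Leftrightarrow Q{\Downarrow}a$ whenever $P \mathcal{R} Q$.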

From the point of view of barbed equivalence an observer can not see the content 
of a communication. What an observer can detect is the ability of a process to 
communicate at particular channels. Two processes are identified if they can 
simulate each other in terms of this ability. 

Definition 4. Let $\mathcal{R}$ be a barbed symmetric relation on $\mathcal{C}$ closed under context. The relation $\mathcal{R}$ is a barbed bisimulation if whenever $P \mathcal{R} Q$ and $P \xrightarrow{\tau} P'$ then $Q \Longrightarrow Q' \mathcal{R} P'$ for some $Q'$. The barbed bisimilarity $\approx_b$ is the largest barbed bisimulation.

The trade-off of the simplicity of the above definition is that it provides little intuition about equivalent processes. We know that it is weaker than most bisimulation equivalences. But we want to know how much weaker it is. We first give some examples of barbed equivalent processes. To make the examples more readable, we will write $A \stackrel{\mathrm{def}}{=} P\ \mathcal{R}\ (A+Q)$ for $P\ \mathcal{R}\ (P+Q)$, where $\mathcal{R}$ is a binary relation on processes. The first example of an equivalent pair is this:

$$A_1 \stackrel{\mathrm{def}}{=} a[x].(P_1 + [x{=}y_1]\tau.Q) + a[x].(P_2 + [x{\ne}y_1]\tau.Q) \approx_b A_1 + a[x].Q$$

If $a[x].Q$ on the right hand side is involved in a communication in which $x$ is replaced by $y_1$ then $a[x].(P_1 + [x{=}y_1]\tau.Q)$ can simulate the action. Otherwise $a[x].(P_2 + [x{\ne}y_1]\tau.Q)$ would do the job. The second example is more interesting:

$$A_2 \stackrel{\mathrm{def}}{=} (z)a[z].(P_1 + [z{=}y_2][z|x].Q) + a[x].(P_2 + [x{\ne}y_2]\tau.Q[x/z]) \approx_b A_2 + a[x].Q[x/z]$$

The communication $\bar a[y_2]|(x)(A_2 + a[x].Q[x/z]) \xrightarrow{\tau} 0|Q[x/z][y_2/x]$ for instance can be matched up by $\bar a[y_2]|(x)A_2 \xrightarrow{\tau} 0|(x)(P_1[y_2/z] + [y_2{=}y_2][y_2|x].Q[y_2/z]) \xrightarrow{\tau} 0|Q[y_2/z][y_2/x]$. The third example is unusual:

$$A_3 \stackrel{\mathrm{def}}{=} a[y_3].(P_1 + [y_3|x].Q) + a[x].(P_2 + [x{\ne}y_3]\tau.Q) \approx_b A_3 + a[x].Q$$

If $a[x].Q$ participates in a communication in which $x$ exchanges for $y_3$ then its role can be simulated by $a[y_3].(P_1 + [y_3|x].Q)$. The fourth is similar:

$$A_4 \stackrel{\mathrm{def}}{=} [y_4|x].(P_1 + a[y_4].Q) + a[x].(P_2 + [x{\ne}y_4]\tau.Q) \approx_b A_4 + a[x].Q$$

If $(y_4)((A_4 + a[x].Q)|\bar a[y_4].0) \xrightarrow{\tau} Q[x/y_4]|0[x/y_4]$ then the simulation is:

$$(y_4)(A_4|\bar a[y_4].0) \xrightarrow{\tau} (P_1[x/y_4] + a[x].Q[x/y_4])|\bar a[x].0[x/y_4] \xrightarrow{\tau} Q[x/y_4]|0[x/y_4]$$

The fifth example is the combination of the fourth and the second:

$$A_5 \stackrel{\mathrm{def}}{=} [y_5|x].(P_1 + (z)a[z].(P_1' + [z{=}y_5]\tau.Q)) + a[x].(P_2 + [x{\ne}y_5]\tau.Q[x/z]) \approx_b A_5 + a[x].Q[x/z]$$

Notice that the component $[y_5|x].(P_1 + (z)a[z].(P_1' + [z{=}y_5]\tau.Q))$ is operationally the same as the process $[y_5|x].(P_1 + (z)a[z].(P_1' + [z{=}y_5][z|x].Q))$.

In the above examples, all the explicit mismatch operators contain the name $x$. In general there could be other conditions. The treatment of the match operator is easy. The mismatch operator is however nontrivial. Suppose $\delta$ is a sequence of mismatch operators such that all names in $\delta$ are different from both $x$ and $z$. An example more general than $A_1$ is this:

$$A_1' \stackrel{\mathrm{def}}{=} a[x].(P_1 + \delta[x{=}y_1]\tau.Q) + a[x].(P_2 + \delta[x{\ne}y_1]\tau.Q) \approx_b A_1' + [x{\notin}n(\delta)]\delta a[x].Q$$

We need to explain the mismatch sequence in $[x{\notin}n(\delta)]\delta a[x].Q$. The $\delta$ before $a[x].Q$ is necessary for otherwise an action of $([x{\notin}n(\delta)]a[x].Q)\sigma$ may not be simulated by any action from $A_1'\sigma$ when $\sigma$ invalidates $\delta$. The $[x{\notin}n(\delta)]$ is necessary because otherwise it would not be closed under substitution. A counter example is the pair $a[x].[y{\ne}z][x{=}y_1]\tau.Q + a[x].[y{\ne}z][x{\ne}y_1]\tau.Q + [y{\ne}z]a[x].Q$ and $a[x].[y{\ne}z][x{=}y_1]\tau.Q + a[x].[y{\ne}z][x{\ne}y_1]\tau.Q$. If we substitute $x$ for $z$ in the two processes we get two processes that are not barbed bisimilar. Similarly the example $A_2$ can be generalized to the following:

$$A_2' \stackrel{\mathrm{def}}{=} (z)a[z].(P_1 + [x{\notin}n(\delta)]\delta[z{=}y_2][z|x].Q) + a[x].(P_2 + \delta[x{\ne}y_2]\tau.Q[x/z]) \approx_b A_2' + [x{\notin}n(\delta)]\delta a[x].Q[x/z]$$

The general form of $A_3$ is more delicate:

$$A_3' \stackrel{\mathrm{def}}{=} [x{\ne}y_3]a[y_3].(P_1 + [x{\notin}n(\delta)]\delta[y_3|x].Q) + a[x].(P_2 + \delta[x{\ne}y_3]\tau.Q) \approx_b A_3' + [x{\ne}y_3][x{\notin}n(\delta)]\delta a[x].Q$$

In both $[x{\ne}y_3]a[y_3].(P_1 + [x{\notin}n(\delta)]\delta[y_3|x].Q)$ and $[x{\ne}y_3][x{\notin}n(\delta)]\delta a[x].Q$ there is the mismatch $[x{\ne}y_3]$. If this operator is removed from $A_3'$ one has

$$B_3' \stackrel{\mathrm{def}}{=} a[y_3].(P_1 + [x{\notin}n(\delta)]\delta[y_3|x].Q) + a[x].(P_2 + \delta[x{\ne}y_3]\tau.Q) \not\approx_b B_3' + [x{\notin}n(\delta)]\delta a[x].Q$$

The inequality is clearer if one substitutes $x$ for $y_3$ in the above:

$$C_3' \stackrel{\mathrm{def}}{=} a[x].(P_1 + [x{\notin}n(\delta)]\delta[x|x].Q) + a[x].(P_2 + \delta[x{\ne}x]\tau.Q) \not\approx_b C_3' + [x{\notin}n(\delta)]\delta a[x].Q$$

The component $[x{\notin}n(\delta)]\delta a[x].Q$ may be involved in a communication in which $x$ is replaced by a name in $\delta$. This action cannot be simulated by $C_3'$. The general forms of $A_4$ and $A_5$ are as follows:

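$$A_4' \stackrel{\mathrm{def}}{=} [y_4|x].(P_1 + \delta a[y_4].Q) + a[x].(P_2 + \delta[x{\ne}y_4]\tau.Q) \approx_b A_4' + [x{\notin}n(\delta)]\delta a[x].Q$$

$$A_5' \stackrel{\mathrm{def}}{=} [y_5|x].(P_1 + (z)a[z].(P_1' + \delta[z{=}y_5]\tau.Q)) + a[x].(P_2 + \delta[x{\ne}y_5]\tau.Q[x/z]) \approx_b A_5' + [x{\notin}n(\delta)]\delta a[x].Q[x/z]$$

If we replace the second summand $a[x].(P_2 + \delta[x{\ne}y_1]\tau.Q)$ of $A_1'$ by $(z)a[z].(P_2 + [x{\notin}n(\delta)]\delta[z{\ne}y_1][z|x].Q)$ and $Q$ by $Q[x/z]$, we get an interesting variant of $A_1'$ as follows:

$$A_1'' \stackrel{\mathrm{def}}{=} a[x].(P_1 + \delta[x{=}y_1]\tau.Q[x/z]) + (z)a[z].(P_2 + [x{\notin}n(\delta)]\delta[z{\ne}y_1][z|x].Q) \approx_b A_1'' + [x{\notin}n(\delta)]\delta a[x].Q[x/z]$$

The bisimilar pairs $A_2'$ through $A_5'$ have similar variants:

$$A_2'' \stackrel{\mathrm{def}}{=} (z)a[z].(P_1 + [x{\notin}n(\delta)]\delta[z{=}y_2][z|x].Q) + O_2 \approx_b A_2'' + [x{\notin}n(\delta)]\delta a[x].Q[x/z]$$

$$A_3'' \stackrel{\mathrm{def}}{=} [x{\ne}y_3]a[y_3].(P_1 + [x{\notin}n(\delta)]\delta[y_3|x].Q[x/z]) + O_3 \approx_b A_3'' + [x{\ne}y_3][x{\notin}n(\delta)]\delta a[x].Q[x/z]$$

$$A_4'' \stackrel{\mathrm{def}}{=} [y_4|x].(P_1 + \delta a[y_4].Q[x/z]) + O_4 \approx_b A_4'' + [x{\notin}n(\delta)]\delta a[x].Q[x/z]$$

$$A_5'' \stackrel{\mathrm{def}}{=} [y_5|x].(P_1 + (z)a[z].(P_1' + \delta[z{=}y_5]\tau.Q[z/x])) + O_5 \approx_b A_5'' + [x{\notin}n(\delta)]\delta a[x].Q[x/z]$$

where $O_i$ is $(z)a[z].(P_2 + [x{\notin}n(\delta)]\delta[z{\ne}y_i][z|x].Q)$ for $i \in \{2,3,4,5\}$. The most complicated situation arises when all the five possibilities as described by $A_1''$ through $A_5''$ happen at one go:

$$\begin{aligned}
A \stackrel{\mathrm{def}}{=}\ & (z)a[z].(P_2 + [x{\notin}n(\delta)]\delta[z{\notin}\{y_1,y_2,y_3,y_4,y_5\}][z|x].Q) \\
& + a[x].(P_1 + \delta[x{=}y_1]\tau.Q[x/z]) \\
& + (z)a[z].(P_1 + [x{\notin}n(\delta)]\delta[z{=}y_2][z|x].Q) \\
& + [x{\ne}y_3]a[y_3].(P_1 + [x{\notin}n(\delta)]\delta[y_3|x].Q[x/z]) \\
& + [y_4|x].(P_1 + \delta a[y_4].Q[x/z]) \\
& + [y_5|x].(P_1 + (z)a[z].(P_1' + \delta[z{=}y_5]\tau.Q[z/x])) \\
\approx_b\ & A + [x{\ne}y_3][x{\notin}n(\delta)]\delta a[x].Q[x/z]
\end{aligned}$$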
Similarly the examples $A_1'$ through $A_5'$ can be combined into one as follows:

$$\begin{aligned}
A' \stackrel{\mathrm{def}}{=}\ & a[x].(P_2 + \delta[x{\notin}\{y_1,y_2,y_3,y_4,y_5\}]\tau.Q[x/z]) \\
& + a[x].(P_1 + \delta[x{=}y_1]\tau.Q[x/z]) \\
& + (z)a[z].(P_1 + [x{\notin}n(\delta)]\delta[z{=}y_2][z|x].Q) \\
& + [x{\ne}y_3]a[y_3].(P_1 + [x{\notin}n(\delta)]\delta[y_3|x].Q[x/z]) \\
& + [y_4|x].(P_1 + \delta a[y_4].Q[x/z]) \\
& + [y_5|x].(P_1 + (z)a[z].(P_1' + \delta[z{=}y_5]\tau.Q[z/x])) \\
\approx_b\ & A' + [x{\ne}y_3][x{\notin}n(\delta)]\delta a[x].Q[x/z]
\end{aligned}$$



Having seen so many bisimilar pairs of processes, the reader might wonder how 
we have discovered them. As a matter of fact these examples are all motivated by 
an alternative characterization of the barbed bisimilarity. This characterization 
is given by an open bisimilarity as defined below. 

Definition 5. Let $\mathcal{R}$ be a binary symmetric relation on $\mathcal{C}$ closed under substitution. The relation $\mathcal{R}$ is a barbed open bisimulation if the following properties hold for $P$ and $Q$ whenever $P \mathcal{R} Q$:

(i) If $\lambda$ is an update or a tau and $P \xrightarrow{\lambda} P'$ then $Q'$ exists such that $Q \stackrel{\hat\lambda}{\Longrightarrow} Q' \mathcal{R} P'$.
(ii) If $P \xrightarrow{\alpha[x]} P'$ then one of the following properties holds:

— Q' exists such that Q Q'TZP' ; 

— Q' and Q" exist such that Q Q" and Q"[x/z\ Q'TZP' ; 

and, for each y different from x, one of the following properties holds: 

— Q' and Q" exist such that Q Q" and Q"[y/x\ Q'TZP'[y/x\; 

— Q' and Q" exist such that Q Q" and Q"[y/z] Q'TZP'[y/x\; 

— Q' exists such that Q Q'TZP' [y /x]; 

— Q' exists such that Q Q'TZP' [y /x]; 

— Q' and Q" exist such that Q q„ QH]^y i 

(iii) If $P \xrightarrow{\alpha(x)} P'$ then, for each $y$, one of the following properties holds:

— Q' and Q" exist such that Q Q" and Q"[y / x\=^Q'TZP'[y / x\; 

— Q' exists such that Q Q'TZP'[y /x]. 

The barbed open bisimilarity is the largest barbed open bisimulation. 

With a definition as complex as Definition 5, it is not very clear if the relation it introduces is well behaved. The next lemma gives one some confidence in the barbed open bisimilarity.

Lemma 6. $\approx_{open}$ is closed under localization and composition.
Since $\approx_{open}$ is closed under substitution, it must also be closed under prefix operation. It is also clear that $\approx_{open}$ is closed under match operation. However the relation is closed neither under the mismatch operation nor under the summation operation. For instance $[x{\ne}y]P \approx_{open} [x{\ne}y]\tau.P$ does not hold. To obtain the largest congruence contained in $\approx_{open}$ we use the standard approach.

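Definition 7. Two processes $P$ and $Q$ are barbed congruent, notation $P \simeq_b Q$, if $P \approx_{open} Q$ and for each substitution $\sigma$ whenever $P\sigma \xrightarrow{\lambda} P'$ then $Q'$ exists such that $Q\sigma \stackrel{\lambda}{\Longrightarrow} Q' \approx_{open} P'$ and vice versa.

The notation $\simeq_b$ is not confusing because it is also the largest congruence contained in $\approx_b$. This is guaranteed by the next theorem.

Theorem 8. $\approx_{open}$ and $\approx_b$ coincide.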

3 Axiomatic System 

In this section we give a complete system for the barbed congruence on the
finite χ-processes. In order to prove the completeness theorem, we need some
auxiliary definitions.

Definition 9. Let V be a finite set of names. We say that φ is complete on V
if n(φ) ⊆ V and for each pair x, y of names in V it holds that either φ ⇒ x=y
or φ ⇒ x≠y. A substitution σ is induced by φ, and φ induces σ, if σ agrees with
φ and σσ = σ.

We now begin to describe a system complete for the barbed congruence. 
Let AS denote the system consisting of the rules and laws in Figure 2 plus the 
following expansion law: 

■Ki—ai [fCj] 

P\Q = '^Mx)'^i-{Pi\Q) + ^ (pii’j{S:){y)[ai=bj][xi\yj].{Pi\Qj) + 

* jj=bj[yj] 

'7ri—a^[xi] 

<l>i^jiS:){y)[ai=bj][xi\yj].{Pi\Qj) 

i lj=bj[yj] 

where P is J2i 4>t{x)Tri.Pi and Q is i>j{y)jj.Qj, tti and range over {a[x\ \ 
a & M yjJI,x € M}. 

Using the axioms in AS, a process can be converted to a process that contains
no occurrence of the composition operator; the latter process is of a special
form, as defined below.

Definition 10. A process P is in normal form on V fn{P) if P is of the 
form W-Pi + Sie/a M^^\yi]-Pi thatx does 

not appear in P, (pi is complete on V for each i G Ii U I2 I3, Pi is in normal 
form on V for i G hU I3 and is in normal form on U U {a;} for i G I2. Here I\, 
I 2 and I 3 are pairwise disjoint finite indexing sets. 
T1    λ.τ.P = λ.P
T2    P+τ.P = τ.P
T3    λ.(P+τ.Q) = λ.(P+τ.Q)+λ.Q
T4    τ.P = τ.(P+ψτ.P)
T5    [y|x].(P+δτ.Q) = [y|x].(P+δτ.Q)+ψδ[y|x].Q        C(ψ,δ)
T6    FF = FF+[x∉Y3][x∉n(δ)]δα[x].Q[x/z]                z∉n(δ)
T7    FR = FR+[x∉Y3][x∉n(δ)]δα[x].Q[x/z]                z∉n(δ)
TD1   RO = RO+δ(x)α[x].Q                                x∉n(δ)



Fig. 1. Tau Laws 



The depth of a process measures the maximal length of nested prefixes in the
process. The structural definition goes as follows: (i) d(0) = 0; (ii) d(α[x].P) =
1+d(P); (iii) d(P|Q) = d(P)+d(Q); (iv) d((x)P) = d(P); (v) d([x=y]P) = d(P),
d([x≠y]P) = d(P); (vi) d(P+Q) = max{d(P), d(Q)}.
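
The structural definition of depth translates directly into a recursive function. The following Python sketch is only an illustration: the class names (Nil, Prefix, Par, Res, Match, Mismatch, Sum) are hypothetical and not part of the paper, and every prefix, whatever its kind, is assumed to count for one level.

```python
from dataclasses import dataclass

# Hypothetical syntax tree for processes; names are illustrative only.
@dataclass(frozen=True)
class Nil:
    pass

@dataclass(frozen=True)
class Prefix:          # any prefix followed by a continuation
    label: str
    cont: object

@dataclass(frozen=True)
class Par:             # P | Q
    left: object
    right: object

@dataclass(frozen=True)
class Res:             # (x)P
    name: str
    body: object

@dataclass(frozen=True)
class Match:           # [x=y]P
    x: str
    y: str
    body: object

@dataclass(frozen=True)
class Mismatch:        # [x≠y]P
    x: str
    y: str
    body: object

@dataclass(frozen=True)
class Sum:             # P + Q
    left: object
    right: object

def depth(p):
    """Depth following clauses (i)-(vi) of the structural definition."""
    if isinstance(p, Nil):
        return 0
    if isinstance(p, Prefix):
        return 1 + depth(p.cont)
    if isinstance(p, Par):
        return depth(p.left) + depth(p.right)
    if isinstance(p, (Res, Match, Mismatch)):
        return depth(p.body)
    if isinstance(p, Sum):
        return max(depth(p.left), depth(p.right))
    raise TypeError(f"unknown process form: {p!r}")

example = Par(Prefix("a[x]", Nil()),
              Sum(Prefix("b[y]", Prefix("c[z]", Nil())), Nil()))
```

Here the composition contributes the sum of the depths of its components, while the summation contributes only the deeper branch.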

Lemma 11. For a process P and a finite set V of names such that fn(P) ⊆ V
there is a normal form Q on V such that d(Q) ≤ d(P) and AS ⊢ Q = P.

In order to obtain a complete system for the barbed congruence, we need some
tau laws, some of which are new and complex. Figure 1 contains the seven tau laws
used in this paper. T4, introduced by the first author in a previous publication,
is a necessary law for open congruences. T5 holds under the condition C(ψ,δ):

If δ ⇒ [u≠v] then either ψ ⇒ [x=u][y≠v] or ψ ⇒ [x=v][y≠u] or ψ ⇒
[y=u][x≠v] or ψ ⇒ [y=v][x≠u] or ψ ⇒ [x≠u][x≠v][y≠u][y≠v].

This law was used for the first time in [5]. The laws T6 and T7 are equational
formalizations of the examples given in Section 2 in a more general form. In these
axioms, FF (respectively FR) stands for

a[x].{P+S[x^Yi U . . . U Y^]t.Q[x/ z\) 

(respectively {z)a[z\.{P+[x^n{5)]6[z0Si U . . . U ¥ 5 ] [z|a;].(5)) 
+Sy(iYia[x\.{Py+5[x=y]T.Q[x/ z\)+Ey(zY 2 {z)oi[z\.{Py+[x^n{ 5 )\ 5 [z=y\ [z\x].Q) 
+SyeY3[xT^y]a[y]-{Py+[x^n{S)]6[y\x].Q[x/z]) 
+PyeY 4 [y\x].{Py+ 6 a[y].{Py+ST.Q[x/z])) 
+PyeYAy\x]-{Py+S{z)a[z].{Py+5[z=y]T.Q)) 

These two laws are new. In TD1, which is derivable from T6, RO is

Py&Yia[y].{Py+6T.Q[y/x]) + Sy(zY 3 {x)a[x].{Py+S[x=y]T.Q) 

+ {x)a[x].{P+5[x^Yi U Y 2 ]t.Q) 

Let AS_b denote AS ∪ {T1, T2, T3, T4, T5, T6, T7}. Without further ado, we
state the main result of this section.

Theorem 12. AS_b is sound and complete for the barbed congruence.
E1    P = P
E2    P = Q                           if Q = P
E3    P = R                           if P = Q and Q = R
C1    α[x].P = α[x].Q                 if P = Q
C2    (x)P = (x)Q                     if P = Q
C3a   [x=y]P = [x=y]Q                 if P = Q
C3b   [x≠y]P = [x≠y]Q                 if P = Q
C4    P+R = Q+R                       if P = Q
C5    P0|P1 = Q0|Q1                   if P0 = Q0 and P1 = Q1
L1    (x)0 = 0
L2    (x)α[y].P = 0                   x ∈ {α, ᾱ}
L3    (x)α[y].P = α[y].(x)P           x ∉ {y, α, ᾱ}
L4    (x)(y)P = (y)(x)P
L5    (x)[y=z]P = [y=z](x)P           x ∉ {y, z}
L6    (x)[x=y]P = 0                   x ≠ y
L7    (x)(P+Q) = (x)P+(x)Q
L8    (x)[y|z].P = [y|z].(x)P         x ∉ {y, z}
L9    (x)[y|x].P = τ.P[y/x]           y ≠ x
L10   (x)[x|x].P = τ.(x)P
M1    φP = ψP                         if φ ⇔ ψ
M2    [x=y]P = [x=y]P[y/x]
M3a   [x=y](P+Q) = [x=y]P+[x=y]Q
M3b   [x≠y](P+Q) = [x≠y]P+[x≠y]Q
M4    P = [x=y]P+[x≠y]P
M5    [x≠x]P = 0
S1    P+0 = P
S2    P+Q = Q+P
S3    P+(Q+R) = (P+Q)+R
S4    P+P = P
U1    [y|x].P = [x|y].P
U2    [y|x].P = [y|x].[x=y]P
U3    [x|x].P = τ.P





Fig. 2. Axiomatic System AS 



LD1   (x)[x|x].P = [y|y].(x)P         U3 and L8
LD2   (x)[y≠z]P = [y≠z](x)P           L5, L7 and M4
LD3   (x)[x≠y]P = (x)P                L6, L7 and M4
MD1   [x=y]0 = 0                      S1, S4 and M4
MD2   [x=x]P = P                      M1
MD3   φP = φ(Pσ)                      where σ is induced by φ; M2
SD1   φP+P = P                        S-rules and M4
UD1   [y|x].P = [y|x].P[y/x]          U2 and M2



Fig. 3. Some Laws Derivable from AS 
4 Ground Congruence

In this section we sketch some main properties of the ground bisimilarity. First
of all, the ground bisimilarity can be characterized by an open bisimilarity called
ground open bisimilarity. The definition of the ground open bisimilarity is
obtained from Definition 5 by replacing clause (ii) by

(ii’) If P p' then Q' exists such that Q Q'TZP' . 

It is easy to prove that the ground open bisimilarity is closed under localization
and composition and that it coincides with the ground bisimilarity. By definition
the ground open bisimilarity is contained in the barbed one. The inclusion is
strict because T7 is not valid for the ground open bisimilarity.

Let the ground congruence be the largest congruence contained in the ground
bisimilarity; its formal definition is completely similar to that of the barbed
congruence. Let AS_g stand for AS ∪ {T1, T2, T3, T4, T5, T6}. It can be similarly
proved that AS_g is sound and complete for the ground congruence.

5 Remark 

Parrow and Victor have studied the fusion calculus [7]. It is a polyadic version
of the χ-calculus. The main observational equivalence they have studied is what
they call weak hyperequivalence. The weak hyperequivalence is essentially a
polyadic version of the open bisimilarity ≈o we have defined in the introduction.
Since the χ-calculus is a monadic version of the fusion calculus and therefore is
a subcalculus of the latter, the counterexample given in the introduction is valid
in the fusion calculus as well. One of the motivations of the ground bisimilarity
is to rectify the weak hyperequivalence. Apart from its theoretical interest, the
barbed bisimilarity is introduced partly to study the ground bisimilarity.
Abstract. We propose an object-oriented calculus with internal concurrency
and class-based inheritance that is built upon the join calculus.
Method calls, locks, and states are handled in a uniform manner, us- 
ing labeled messages. Classes are partial message definitions that can 
be combined and transformed. We design operators for behavioral and 
synchronization inheritance. Our model is compatible with the JoCaml 
implementation of the join calculus. 



1 Introduction 

Object-oriented programming has long been praised as favoring abstraction, in- 
cremental development, and code reuse. Objects can be created by instantiating 
definition patterns called classes, and in turn complex classes can be built from 
simpler ones. To make this approach effective, the assembly of classes should rely 
on a small set of operators with a clear semantics and should support modular 
proof techniques. In a concurrency setting, such promises can be rather hard to 
achieve. 

The design and the implementation of concurrent object-oriented languages, 
e.g. [2, 20, 1, 4], has recently prompted the investigation of the theoretical foun- 
dations of concurrent objects. Several works provide encodings of objects in pro- 
cess calculi [19, 18, 12, 5] or, alternatively, supplement objects with concurrent 
primitives [16, 3, 11]. Those works promote a unified framework for reasoning 
about objects and processes, but they do not address the composition of object 
definitions or its typechecking. 

In this work, we model concurrent objects in a simple process calculus — 
a variant of the join calculus [7, 6], we design operators for behavioral and 
synchronization inheritance, and we give a type system that statically enforces 
basic safety properties. 

The join calculus is a simple name-passing calculus, related to the pi cal- 
culus but with a functional flavor. It is the core of a distributed programming 
language, currently implemented as an extension of ML [8, 13]. In the join cal- 
culus, communication channels are statically scoped: when channels are created, 
their definition provides a set of reaction rules that specify, once and for all, how
messages sent on these names will be synchronized and processed. Although the
join calculus does not have a primitive notion of object, definitions encapsulate 
the details of synchronization, much as concurrent objects. 

Applying the well-known objects-as-records paradigm to the join calculus, we 
obtain a simple language of objects with asynchronous message passing. Method 
calls, locks, and states are handled in a uniform manner, using labeled messages. 
There is no primitive notion of functions, calling sequences, or threads (they can 
all be encoded using continuation messages). Our language — the objective join 
calculus — allows fine-grain internal concurrency, as each object may send and 
receive several messages in parallel. 

For every object of our language, message synchronization is defined and com- 
piled as a whole. This allows an efficient compilation of message delivery into 
automata [14] and simplifies reasoning on objects. However, the static definition 
of behavior can be overly restrictive for the programmer. This suggests some 
compile-time mechanism for assembling partial definitions. To this end, we pro- 
mote partial definitions into classes. Classes can be combined and transformed 
to form new classes. They can also be closed to create objects. 

The class language is layered on top of the core objective calculus, with a se- 
mantics that reduces classes into plain object definitions. We thus retain strong 
static properties for all objects at run-time. Some operators are imported from 
sequential languages and adapted to a concurrent setting. For instance, multiple 
inheritance is expressed as a disjunction of join definitions, but some disjunc- 
tions have no counterpart in a sequential language. In addition, we propose a 
new operator, called selective refinement. Selective refinement applies to a parent 
class, and rewrites the parent reaction rules according to their synchronization 
patterns. Selective refinement treats synchronization concretely, but it handles 
the parent processes abstractly. Our approach is compatible with the JoCaml 
implementation of the join calculus [13], which already singles out synchroniza- 
tion patterns using concrete compile-time representation, and, on the contrary, 
compiles behaviors into functional closures. 

Our design of the class language follows from common programming patterns 
in the join calculus. We motivate it further by coding some standard problematic 
examples that mix synchronization and inheritance. 

The language is equipped with a polymorphic type system, in the style of [9]; 
in addition to basic safety properties, the type system also enforces privacy. The 
formal presentation of both dynamic and static semantics, the soundness results, 
and their proofs are omitted from this extended abstract. They can be found in 
the full paper [10]. 

The paper is organized as follows. In section 2, we present the objective join 
calculus and develop a few examples. In section 3, we supplement the language 
with classes. In section 4, we provide more involved examples of inheritance and 
concurrency. In section 5, we discuss related works and possible extensions. 
2 The Objective Join Calculus

Getting Started. The basic operation of our calculus is asynchronous message
passing in object style. For instance, the process out.print_int(n) sends a message
with label print_int and with content n to an object named out, meant to
print integers on the terminal.

Accordingly, the definition of an object describes how messages received on 
some labels can trigger processes. For instance, 

obj continuation = reply(x) ▷ out.print_int(x)

defines an object that reacts to messages on reply by printing their content on 
the terminal. Another example is the rendez-vous, or synchronous buffer: 

obj sbuffer = get(r) & put(a,s) ▷ r.reply(a) & s.reply()

The object sbuffer has two labels get and put; it reacts to the simultaneous 
presence of one message on each of these labels by passing a message to the 
continuation r, with label reply and contents a, and passing an empty message 
to s. (Object r may be the previously-defined continuation; object s is another 
continuation taking no argument on reply.) As regards the syntax, concurrent 
execution and message synchronization are expressed in a symmetric manner 
using the same infix operator &. Also, the calculus is polyadic, i.e., messages 
carry tuples of values. 

Some labels may convey messages representing the internal state of an ob- 
ject, rather than an external method call. This is the case of label Some in the 
following unbounded, unordered, asynchronous buffer: 

obj abuffer = self(z)
    put(a,r) ▷ r.reply() & z.Some(a)
 or get(r) & Some(a) ▷ r.reply(a)

The object abuffer can react in two different ways: a message (a, r) on put
may be consumed by storing the value a in a self-inflicted message on Some;
alternatively, a message on get and a message on Some may be jointly consumed, 
and then the value stored on Some is sent to the continuation received on get. 
The indirection through Some makes abuffer behave asynchronously: messages 
on put are never blocked, even if no message is ever sent on get. As regards the 
syntax, the prefix self(z) explicitly binds the name z to the defined object. 

In the example above, the messages on label Some encode the state of abuffer. 
The following definition illustrates a tighter management of state that imple- 
ments a one-place buffer: 

obj buffer = self(z)
    put(a,r) & Empty() ▷ r.reply() & z.Some(a)
 or get(r) & Some(a) ▷ r.reply(a) & z.Empty()
 init buffer.Empty()

Such a buffer can either be empty or contain one element. The state is encoded 
as a message pending on Empty or Some, respectively. Object buffer is created 
empty, by sending a first message on Empty in the (optional) init part of the 
Fig. 1. Syntax for the core object calculus

P ::=                                     Processes
      0                                   null process
    | x.M                                 message sending
    | P1 & P2                             parallel composition
    | obj x = self(z) D init P1 in P2     object definition
D ::=                                     Definitions
      M ▷ P                               reaction rule
    | D1 or D2                            disjunction
M ::=                                     Patterns
      ℓ(ũ)                                message
    | M1 & M2                             synchronization



obj construct. As opposed to abuffer above, a put message is blocked when the 
buffer is not empty. 
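
The behavior of the one-place buffer can be imitated by a small sequential simulation. The sketch below is an assumption-laden illustration, not the semantics of the calculus: the class JoinObject and the helper make_buffer are hypothetical names, continuations are modeled as Python callables, and rules are tried in a fixed order, whereas the calculus is concurrent and non-deterministic.

```python
from collections import defaultdict, deque

class JoinObject:
    """Hypothetical mini-engine: a rule fires when every label of its
    pattern has at least one pending message."""

    def __init__(self, rules):
        self.rules = rules                  # list of (labels, body)
        self.pending = defaultdict(deque)   # label -> queued argument tuples

    def send(self, label, *args):
        self.pending[label].append(args)
        self._react()

    def _react(self):
        fired = True
        while fired:
            fired = False
            for labels, body in self.rules:
                if all(self.pending[l] for l in labels):
                    # consume one message per label of the pattern
                    msgs = [self.pending[l].popleft() for l in labels]
                    body(self, *msgs)
                    fired = True
                    break

def make_buffer():
    """One-place buffer: state is a pending message on Empty or Some."""
    def on_put(z, put_msg, _empty):
        a, r = put_msg
        r()                  # r.reply()
        z.send("Some", a)    # z.Some(a)

    def on_get(z, get_msg, some_msg):
        (r,) = get_msg
        (a,) = some_msg
        r(a)                 # r.reply(a)
        z.send("Empty")      # z.Empty()

    b = JoinObject([(("put", "Empty"), on_put),
                    (("get", "Some"), on_get)])
    b.send("Empty")          # the init part
    return b
```

A second put sent while the buffer is full stays pending until a get restores a message on Empty, which is the blocking behavior described above.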

To keep the buffer object consistent, there should be a single message pending 
on either Empty or Some. This invariant holds as long as external users cannot 
send messages on these labels directly. In the full paper [10], we describe a refined 
semantics and a type system that distinguishes private labels such as Empty and 
Some from public labels, and restrict access to private labels. In the examples, 
private labels conventionally bear an initial capital letter. 

Once private labels are hidden, each variant of our buffer provides the same 
interface to the outside world (two methods labeled get and put) but their con- 
current behaviors are very different. 



Syntax. We use two disjoint countable sets of identifiers for object names x, z, u ∈
O and for labels ℓ ∈ ℒ. Tuples of names are written x̃. The
grammar of the objective join calculus (without classes) is given in Figure 1;
it has syntactic categories for processes P, definitions D, and patterns M. We
abbreviate obj x = self(z) D init P in Q by omitting self(z) when z does not
occur free in D and omitting init P when P is 0.

A reaction rule M ▷ P associates a pattern M with a guarded process P.
Every message pattern ℓ(ũ) in M binds the object names ũ with scope P. We
require that every pattern M guarding a reaction rule be linear; that is, each
label and each name appears at most once in M. This will be enforced by typing.
In addition, an object definition obj x = self(z) D init P1 in P2 binds two names
x and z to D. The scope of x is the processes P1 and P2; the scope of z is every
guarded process in D. Free names in processes and definitions, written fn(·), are
defined according to these binders. Terms are taken modulo renaming of bound
names (or α-conversion).
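
The linearity requirement on patterns can be checked mechanically. The following Python sketch assumes a hypothetical representation in which a pattern is a list of (label, names) pairs; the paper enforces the condition by typing, so this is only an illustration of what the condition says.

```python
def is_linear(pattern):
    """A pattern is linear when no label and no bound name occurs twice.

    `pattern` is a list of (label, names) pairs, e.g.
    [("get", ["r"]), ("put", ["a", "s"])] for get(r) & put(a,s).
    """
    labels = [label for label, _ in pattern]
    names = [n for _, ns in pattern for n in ns]
    return len(set(labels)) == len(labels) and len(set(names)) == len(names)

def bound_names(pattern):
    """Names bound by the pattern, with the guarded process as scope."""
    return {n for _, ns in pattern for n in ns}
```

For instance, get(r) & put(a,s) is linear, whereas put(a,r) & put(b,s) repeats a label and get(r) & Some(r) repeats a name.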
The reduction relation on processes is defined using a reflexive chemical abstract
machine; it appears in the full paper.

3 Inheritance and Concurrency 

We now extend the calculus of concurrent objects with classes and inheritance. 
The behavior of objects in the join calculus is statically defined: once an object 
is created, it cannot be extended with new labels or with new reaction rules 
synchronizing existing labels. Instead, we provide this flexibility at the level of 
classes. Our operators on classes can express various object paradigms, such as 
method overriding (with late binding) or method extension. As regards concur- 
rency, these operators are also suitable to define synchronization policies in a 
modular manner. 

Deriving a Concurrent Class. We introduce the syntax for classes in a series of 
simple examples. We begin with a class buffer behaving as the one-place buffer 
of Section 2: 

class buffer = self(z)
    get(r) & Some(a) ▷ r.reply(a) & z.Empty()
 or put(a,r) & Empty() ▷ r.reply() & z.Some(a)

The class buffer can be used to create objects:

obj b = buffer init b.Empty()

Assume that, for debugging purposes, we want to log the buffer content on 
the terminal. Our first solution uses an explicit log label. 

class logged_buffer = self(z)
    buffer
 or log() & Some(a) ▷ out.print_int(a) & z.Some(a)
 or log() & Empty() ▷ out.print_string("Empty") & z.Empty()

The class body above is a disjunction of an inherited class and of additional
reaction rules. The intended meaning of disjunction is that reaction rules are
cumulated, yielding competing behaviors for messages on labels that appear in
several disjuncts. The order of the disjuncts does not matter. The programmer
who writes logged_buffer must have some knowledge of the parent class buffer,
namely the use of private labels Some and Empty for representing the state.

Other possible debugging information is the synchronous log of all messages 
that are consumed on put. This is done by selecting the patterns in which put 
occurs and adding a printing message to the corresponding guarded processes: 

class logged_buffer_bis =
    match buffer with
      put(a,r) ⇒ put(a,r) ▷ out.print_int(a)
    end

The match construct can be understood by analogy with pattern matching à
la ML, applied to the reaction rules of the parent class. Here, any reaction rule
from the parent buffer whose pattern contains the label put is replaced in the
derived logged_buffer_bis by a rule with the same pattern (put appears on both
sides of ⇒) and with the original guarded process in parallel with a printing
message (the parent action and the & are left implicit in the match syntax). Any
other parent rule remains unchanged. Hence, the above definition behaves as the
following definition:

class logged_buffer_bis =
    get(r) & Some(a) ▷ r.reply(a) & z.Empty()
 or put(a,r) & Empty() ▷ r.reply() & z.Some(a) & out.print_int(a)

Yet another kind of debugging information is a log of put attempts: 

class logged_buffer_ter = self(z)
    match buffer with
      put(a,r) ⇒ Parent_put(a,r) ▷ 0
    end
 or put(a,r) ▷ out.print_int(a) & z.Parent_put(a,r)

In this case, the match construct performs a renaming of put into Parent_put
in the patterns of selected rules, without affecting their guarded processes. The
net effect is similar to parent method overriding, with the new put calling the
parent one and a late-binding semantics for guarded processes.
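
The effect of selective refinement on a list of reaction rules can be sketched in a few lines of Python. This is a deliberately simplified model under stated assumptions: patterns are reduced to lists of labels (arguments and name binding are ignored), guarded processes are lists of action strings, and the function name refine is hypothetical. The precise semantics is the one given in the full paper.

```python
def refine(parent, clauses):
    """Apply refinement clauses to each parent rule, first match wins.

    parent  : list of (pattern_labels, actions)
    clauses : list of (selected_labels, replacement_labels, extra_actions)
    An empty selection acts as a default case.
    """
    result = []
    for pattern, actions in parent:
        for selected, replacement, extra in clauses:
            if set(selected) <= set(pattern):
                kept = [l for l in pattern if l not in selected]
                # parent actions are kept; the clause's actions run in parallel
                result.append((kept + list(replacement), actions + extra))
                break
        else:
            result.append((pattern, actions))   # unselected rule is unchanged
    return result

buffer = [
    (["get", "Some"], ["r.reply(a)", "z.Empty()"]),
    (["put", "Empty"], ["r.reply()", "z.Some(a)"]),
]

# Logging successful puts: same pattern, one more action.
logged_bis = refine(buffer, [(["put"], ["put"], ["out.print_int(a)"])])

# Logging put attempts: rename put in patterns, then add a new put rule.
logged_ter = refine(buffer, [(["put"], ["Parent_put"], [])]) + [
    (["put"], ["out.print_int(a)", "z.Parent_put(a,r)"]),
]
```

In this model the two derived classes differ exactly as described: the first adds an action to the rule that consumes put together with Empty, while the second makes the new put rule fire on every attempt and forward to the renamed parent rule.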

The examples above illustrate that the very idea of class refinement is less
abstract in a concurrent setting than in a sequential one. In the first logged_buffer
example, logging the buffer state requires knowledge of how this state is encoded;
otherwise, some states might be forgotten or logging might lead the buffer into
deadlock. The other two examples expose another subtlety: in a sequential
language, the distinction between logging put attempts and put successes is
irrelevant. Thinking in terms of sequential object invocations, one may be unaware of
the concurrent behavior of the object, and thus write logged_buffer_ter instead
of logged_buffer_bis.

Syntax. The language with classes extends the core calculus of section 2; its
grammar is given in Figure 2. Classes are taken up to the associative-commutative
laws for disjunction or. We use two additional sets of identifiers for class names
c ∈ C and for sets of labels L ∈ 2^ℒ. Such sets L are used to represent abstract
classes that declare the labels in L but do not define them.

Join patterns J generalize the syntactic category of patterns M given in
Figure 1 with an or operator that represents alternative synchronization patterns.
Join patterns are taken up to simple equivalence laws: & and or are
associative-commutative, and & distributes over or. Hence, every join pattern J can be
written as a non-empty alternative of patterns or_{i∈I} M_i, and the reaction rule
(or_{i∈I} M_i) ▷ P behaves as or_{i∈I} (M_i ▷ P).
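
The normal form of a join pattern follows from distributing & over or. The Python sketch below assumes a hypothetical tuple encoding of join patterns, ("msg", label, names), ("and", J1, J2) and ("or", J1, J2), and returns the list of alternatives, each being a list of messages.

```python
def normal_form(j):
    """Rewrite a join pattern as an alternative of plain patterns.

    The result is a list of alternatives; each alternative is the list of
    messages of one conjunction M_i.
    """
    kind = j[0]
    if kind == "msg":
        return [[j]]
    if kind == "or":
        return normal_form(j[1]) + normal_form(j[2])
    if kind == "and":
        # & distributes over or: pair every alternative on the left
        # with every alternative on the right
        return [m1 + m2
                for m1 in normal_form(j[1])
                for m2 in normal_form(j[2])]
    raise ValueError(f"unknown join pattern: {j!r}")

put = ("msg", "put", ["v", "r"])
empty = ("msg", "Empty", ["A", "i", "n"])
some = ("msg", "Some", ["A", "i", "n"])
alternatives = normal_form(("and", put, ("or", empty, some)))
```

With this encoding, put(v,r) & (Empty(A,i,n) or Some(A,i,n)) yields the two alternatives put & Empty and put & Some, each of which guards the same process.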

Selection patterns K are either join patterns or the empty pattern 0. Their
normal forms are of the form above, except that I can be empty. We always
assume that patterns J and K meet the following well-formedness conditions. Free
names fn(K) are defined in the obvious way. As usual, ⊎ means disjoint union.
Fig. 2. Syntax for classes

P ::=                                     Processes
      ...                                 (as in figure 1)
    | obj x = self(z) C init P1 in P2     object definition
    | class c = self(z) C in P            class definition
C ::=                                     Classes
      c                                   class variable
    | L                                   abstract class
    | J ▷ P                               reaction rule
    | C1 or C2                            disjunction
    | self(x) C                           self binding
    | match C with S end                  selective refinement
S ::=                                     Refinement clauses
      (K1 ⇒ K2 ▷ P) | S                   refinement sequence
    | 0                                   empty refinement
J ::=                                     Join patterns
      ℓ(ũ)                                message
    | J1 & J2                             synchronization
    | J1 or J2                            alternative
K ::=                                     Selection patterns
      0                                   empty pattern
    | J                                   join pattern



1. All conjuncts M_i in the normal form of K are linear (as defined in section 2)
and bind the same names. By extension, we say that K binds the
names bound in every M_i, and write fn(K) for these names.

2. In a refinement clause K1 ⇒ K2 ▷ P, the pattern K1 is either M or 0, and
the pattern K2 binds at least the names of K1 (fn(K1) ⊆ fn(K2)).

As described in section 2, binders for object names include object definitions
(binding the defined object to name x and self name z) and patterns (binding
the received names). In a reaction rule J ▷ P, the join pattern J binds fn(J) with
scope P. In a refinement clause K1 ⇒ K2 ▷ P, the selection pattern K1 binds
fn(K1) with scope K2 and P; the modification pattern K2 binds fn(K2)\fn(K1)
with scope P. Finally, the self renaming self(x) C binds the object name x with
scope C. Class definitions class c = C in P are the only binders for class names c,
with scope P. Processes, classes, and reaction rules are taken up to α-conversion.

Labels don’t have scopes. Join patterns J declare the labels appearing in 
their message. Classes C declare the labels of their reaction rules. Abstract 
classes trivially declare their labels. Finally, selective refinements declare labels 
appearing either in the parent class or in a refinement clause. 
Class expressions are simplified by means of a reduction semantics that yields
processes in the core calculus without classes. This reduction semantics (see
the full paper [10]) has been designed to support separate compilation of classes.



4 Inheritance Anomaly 

As remarked by many authors, the classical point of view on class abstraction — 
method names and signatures are known, method bodies are abstract — does 
not mix well with concurrency. More specifically, concurrency and class-based 
inheritance are not orthogonal. This well-known problem is often referred to 
as the inheritance anomaly. Unfortunately, inheritance anomaly is not defined 
formally, but by means of examples as in [15], where three patterns of inheritance 
anomaly are given. 

The examples in [15] demonstrate that extending a base class with new capacities
has an impact on the (desirable) concurrent behavior of the capacities
that are inherited from the base class. Straightforward extensions of sequential
languages to concurrency, such as implementing synchronization in method
bodies or providing simple locking policies (cf. synchronized in Java), prove
insufficient here: the former because synchronization code is not accessible, the
latter because the provided synchronization policies are not expressive enough.
The authors of [15] (partially) solve inheritance anomalies by making concrete
some parts of the parent class (such as “concurrency control”). Note that they do
so by considering a new extension for each anomaly.

Our approach is different: starting from a concurrent language we are more 
concerned with the expressive power of our inheritance operators. Solving the 
three categories of inheritance anomaly, as we do, appears to be a valuable test. 

To this aim, we consider the same running example as Matsuoka and 
Yonezawa: a FIFO buffer with two methods put and get to store and retrieve 
items. We also adopt their taxonomy of inheritance anomaly: inheritance in- 
duces desirable modifications of “acceptable states” [of objects], and a solution 
is a way to express these modifications. 

We extend our programming language with booleans and integers, with the usual
primitive operations. Arrays are created by create(n), which gives an uninitialized
array of size n. The size of array A is retrieved by A.size. Finally, A[i] ← v is array
A where the i-th item has been replaced with v.

class buff = self(z)
    put(v,r) & (Empty(A, i, n) or Some(A, i, n)) ▷
      r.reply() & z.Check(A[(i+n) mod A.size] ← v, i, n+1)
 or get(r) & (Full(A, i, n) or Some(A, i, n)) ▷
      r.reply(A[i]) & z.Check(A, (i+1) mod A.size, n−1)
 or Check(A, i, n) ▷
      if n = A.size then z.Full(A, i, A.size)
      else if n = 0 then z.Empty(A, 0, 0)
      else z.Some(A, i, n)
 or Init(size) ▷ z.Empty(create(size), 0, 0)

The state of the buffer is encoded as a message with label Empty, Some, or 
Full. The buffer may react to messages on put when non-full, and to messages on 
get when non-empty; this is expressed in a concise manner using the or operator 
in patterns. Once a request is accepted, the state of the buffer is recomputed by 
sending an internal message on Check. As Check appears alone in a join pattern, 
message sending on Check acts like a function call, which can be in-lined by an 
optimizing compiler. 
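
The state transitions of buff can be traced with a purely functional simulation. In the Python sketch below, the state message is modeled as a tuple (tag, A, i, n); the function names init, put, get and check mirror the labels of the class but are otherwise hypothetical, and a blocked request is modeled by returning None instead of leaving a message pending.

```python
def check(A, i, n):
    """Recompute the state label from the number n of stored items."""
    if n == len(A):
        return ("Full", A, i, len(A))
    if n == 0:
        return ("Empty", A, 0, 0)
    return ("Some", A, i, n)

def init(size):
    return ("Empty", [None] * size, 0, 0)

def put(state, v):
    """put is accepted in states Empty and Some; None means 'blocked'."""
    tag, A, i, n = state
    if tag == "Full":
        return None
    A = list(A)                    # A[(i+n) mod A.size] <- v, as a fresh array
    A[(i + n) % len(A)] = v
    return check(A, i, n + 1)

def get(state):
    """get is accepted in states Full and Some; returns (item, new state)."""
    tag, A, i, n = state
    if tag == "Empty":
        return None
    return A[i], check(A, (i + 1) % len(A), n - 1)
```

A short run with a buffer of size 2 goes through Empty, Some, Full, Some and back to Empty, serving items in FIFO order.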

Partitioning of acceptable states. The class buff2 supplements buff with a new 
method get2 that atomically retrieves two items from the buffer. For simplicity, 
we assume here size > 2. 

Since get2 succeeds when the buffer contains two elements or more, the buffer 
state needs to be refined. Furthermore, since for instance, a successful get2 may 
disable get or enable put, the addition of get2 has an impact on the “acceptable 
states” of get and put, which are inherited from the parent buff. Therefore, label 
Some is no longer pertinent and is replaced with two labels One and Many. One 
models the state when the buffer holds exactly one item; Many defines a state 
with two items or more in the buffer. 

class buff2 = self(z)
    get2(r) & (Full(A, i, n) or Many(A, i, n)) ▷
      r.reply(A[i], A[(i+1) mod A.size])
      & z.Check(A, (i+2) mod A.size, n−2)
 or match buff with
      Some(A, i, n) ⇒ (One(A, i, n) or Many(A, i, n)) ▷ 0 end
 or Some(A, i, n) ▷
      if n > 1 then z.Many(A, i, n) else z.One(A, i, n)

In the program above, a new method get2 is defined, with its own synchronization
condition. The new reaction rule is cumulated with those of buff, using
a selective refinement that substitutes “One(. . .) or Many(. . .)” for every
occurrence of “Some(. . .)” in a join pattern. The refinement eliminates Some from
any inherited pattern, but it does not affect occurrences of Some in inherited
guarded processes: the parent code is handled abstractly, so it cannot be modified.
Instead, the new class provides an adapter rule that consumes any message
on Some and issues a message on either One or Many, depending on the value
of n.

History-dependent acceptable states. The class gget_buff alters buff as follows:
the new method gget returns one item from the buffer (like get), except that
a request on gget can be served only immediately after serving a request on
put. More precisely, a put transition enables gget, while get and gget transitions
disable it. This condition is reflected in the code by introducing two labels
AfterPut and NotAfterPut. Then, messages on gget are synchronized with messages
on AfterPut.

class gget_buff = self(z)
    gget(r) & AfterPut()
      & (Full(A, i, n) or Some(A, i, n)) ▷
      r.reply(A[i]) & z.NotAfterPut()
      & z.Check(A, (i+1) mod A.size, n−1)
 or match buff with
      Init(size) ⇒ Init(size) ▷ z.NotAfterPut()
    | put(v, r) ⇒
        put(v, r) & (AfterPut() or NotAfterPut()) ▷
          z.AfterPut()
    | get(r) ⇒
        get(r) & (AfterPut() or NotAfterPut()) ▷
          z.NotAfterPut()
    end

The first clause in the match construct refines initialization, which now also 
issues a message on NotAfterPut. The two other clauses refine the existing meth- 
ods put and get, which now consume any message on AfterPut or NotAfterPut 
and produce a message on AfterPut or NotAfterPut, respectively. 

Modification of acceptable states. A general-purpose lock may be defined as:

class locker = self(z)
    suspend(r) & Free() ▷ r.reply() & z.Locked()
 or resume(r) & Locked() ▷ r.reply() & z.Free()

The class locker can be used to create locks, but it can also be combined 
with some other class c to control message processing on the labels of c. To this 
end, a simple disjunction of c and locker is not enough and some refinement of 
the parent c is required. 

class locked_buff = self(z)
    locker
 or match buff with
      Init(size) ⇒ Init(size) ▷ z.Free()
    | 0 ⇒ Free() ▷ z.Free()
    end

The first clause in the match construct supplements the initialization of buff 
with an initial Free message for the lock. The second clause matches every other 
rule of bujf, and requires that the refined clause consume and produce a message 
on Free. (The semantics of clause selection follows the textual priority scheme of 
ML pattern-matching, where a clause applies to all reaction rules that are not 
selected by previous clauses, and where the empty selection pattern acts as a 
default case.) 

As a consequence of these changes, parent rules are blocked between a call to
suspend and the next call to resume, and parent rules leave the state of the lock
unchanged. In contrast with previous examples, the code above is quite general;
it applies to any class following the same convention as buff for initialization.

5 Related and Future Works

There are many works on supplementing object calculi with concurrent primitives
[16, 3, 11], or on supplementing process calculi with objects (usually by
means of an encoding in the original calculus) [19, 18, 12, 5]. Our work follows
the latter tradition. However, to our knowledge, it is the only one to address safe 
object composition in a process calculus. 

In [17], Odersky proposes an object-oriented extension of a language based 
on the join calculus. His proposal amounts to adding some record structure 
to join definitions, as we do in section 2. However, Odersky does not consider 
the problem of inheritance and refinement of synchronization. A small technical 
difference is that Odersky’s calculus is designed with pattern matching on values: 
values in Odersky’s calculus are not only object names but also constructed 
values (such as strings, integers, lists, etc.); then, the shape of values can be taken 
into account during synchronization. For instance, a rule of the form ℓ(h :: t) ▷ P
is allowed, and will only fire when a message on ℓ carries a list that contains at
least one cell. The extension of our calculus to structured values is easy (see [9]). 
However, we believe that synchronization should only concern names, and not 
depend on the shape of values itself. In particular, this allows a simpler semantics 
and an efficient compilation of synchronization into automata [14]. 

Since our type system abstracts from the shape of synchronization patterns in 
classes, it is blind to a number of relevant properties of concurrency, such as the 
presence of race conditions or deadlock freedom. The design of a sophisticated 
analyzer that is sensitive to synchronizations is a promising research direction. 

6 Conclusions 

We have proposed a simple, object-based variant of the join calculus. Every 
object is defined as a fixed set of reaction rules that describe its synchroniza- 
tion behavior. The expressiveness of the language is significantly increased by 
adding classes — a form of open definitions that can be incrementally assembled 
before object instantiation. Thereby, we partially recover the ability of the pi 
calculus to dynamically define the receptive behavior for messages. Our layered 
design confines this capability to classes. From a programming-language view- 
point, this seems a good compromise between flexibility and simplicity of the 
model. Indeed, our proposal still enables efficient compilation of synchronization
and type inference.

Abstract. The problem we consider is motivated by allocating bandwidth
slots to communication requests on a satellite channel under real
time constraints. Accepted requests must be scheduled on non-inter- 
secting rectangles in the time/bandwidth Cartesian space with the goal 
of maximizing the benefit obtained from accepted requests. This prob- 
lem turns out to be equal to the maximization version of the well known 
Dynamic Storage Allocation problem when storage size is limited and 
requests must be accommodated within a prescribed time interval. 

We present constant approximation algorithms for the problem intro- 
duced in this paper using as a basic step the solution of a fractional 
Linear Programming formulation. 

This problem has been independently studied by Bar-Noy et al.
[BNBYF+00] with different techniques. Our approach gives an improved
approximation ratio for the problem.



1 Introduction 

The problem we study in this paper has been encountered in the context of 
the EU research project Euromednet on scheduling requests for remote medical 
consulting on a shared satellite channel [Erne] . Every request asks for a number of 
contiguous bandwidth slots to provide every end user involved in the consulting 
with a TCP/IP satellite channel. Bandwidth is assigned in slots of 64 Kb/sec. The
number of slots per end user depends on the type of service desired (typical values
range from 64 Kb/sec for common Internet services to 384 Kb/sec for audio/video). At
most 48 slots of 64 Kb/sec are available on the channel in this specific application.
Requests also specify a duration of the consulting (typical values are from 1/2 
hour to 2 hours) to be allocated within a time interval specified in the request. 
Requests are typically issued a few days in advance. The service manager will
reply soon with a positive or a negative answer on the basis of the pending
requests and of the requests already accepted. Every accepted request is allocated
starting from a base bandwidth for a contiguous number of slots along a time
duration within the indicated time interval. The total bandwidth assigned to a
single request must be contiguous due to the constraints imposed by FDMA
(Frequency Division Multiple Access) technology. Other details regarding this
specific application are available in [Erne].

The problem encountered in this application is a natural interesting combi- 
natorial problem: if we consider a Cartesian space with the bandwidth on the 
ordinate and the time on the abscissa then every accepted request is a rectan- 
gle of basis equal to the duration and height equal to the bandwidth requested. 
Accepted requests must observe the packing constraint, that is any two rectan- 
gles are non-overlapping. A benefit is also assigned to every request, modeling 
its relevance or the economic revenue from its acceptance. The objective is to 
maximize the overall benefit of accepted requests. In the sequel of the paper we 
denote this problem as the Rectangle Packing (RP) problem. 

This problem is related to a number of well studied combinatorial problems. 
Consider the machine scheduling problem with real time constraints in which 
every job asks to be scheduled for a given duration between a release time and 
a deadline. Only one job can be scheduled at any time on every single machine. 
A benefit is associated with every job with the goal of maximizing the benefit 
obtained from scheduled jobs. This is an old NP-hard scheduling problem [GJ79]. 
Very recently the first constant approximation algorithms have been proposed for
this problem [BNGNS99] both on single and parallel machines.

A second related problem is the Dynamic Storage Allocation problem (DSA).
DSA concerns the dynamic allocation of contiguous areas in a storage device.
In DSA a set of requests for a contiguous area of memory along a specified time
duration has to be accommodated while minimizing the storage space required.
DSA is a classical problem in computer science [Knu73] whose study dates back to the
sixties. The rectangle packing problem is a maximization version of DSA where
we have to allocate bandwidth rather than storage space. In RP storage space is
limited. On the other hand we insert real time constraints on the temporal
allocation of the process. We believe that this version of DSA is of relevance in many
practical settings. DSA has been shown to be tightly related to interval graph 
coloring. Kierstead and Slusarek [Kie91, Slu89] proposed 3-approximation algo- 
rithms for aligned DSA, where the storage space required is a power of 2. More 
recently Gergov [Ger96] proposed a 5/2 approximation algorithm for aligned 
DSA that implies a 5 approximation for DSA and claimed a 3 approximation 
algorithm [Ger99]. 

A third closely related problem is the call control problem on linear networks 
[GGK97, BNGK+95]. This problem has been typically considered in its on-line 
version. At any step a request for establishing a connection between two vertices 
on the line network with a given bandwidth is presented. The algorithm has to 
accept or reject the request without knowledge of the requests presented in the 
future. If the request is accepted, a given benefit is obtained. The objective is
to maximize the obtained benefit without violating the bandwidth constraint
on any link of the network. In the call control problem every request must be 
assigned on a fixed path in the line network, while in the RP problem some 
slackness in the time allocation may be allowed. However, the major difference 
from RP is that in call control we only require the bandwidth constraint, impos- 
ing that the overall bandwidth allocated on a link does not exceed the capacity 
of the link, rather than the stronger packing constraint of the RP problem. 

Results of the paper. We present a 12 approximation algorithm for the RP 
problem. As a basic step of the algorithm we solve a fractional LP problem in 
which we only enforce the bandwidth constraint and requests can be fraction- 
ally accepted. We then show with a novel rounding technique that the optimal 
fractional solution is a convex combination of a set of integral solutions with 
a specific property that we call stability, of which we select that with highest 
benefit. The selected solution may still contain intersecting rectangles. However 
it can be partitioned into three feasible solutions of which we select the best 
one as final solution of the algorithm. The approximation ratio we obtain is 6 if 
the bandwidth requested is a power of 2, 12 in the general case. The proposed 
solution runs in pseudopolynomial time. It can be transformed into a fully
polynomial time algorithm at the expense of a small increase in the approximation
ratio. We also show a combinatorial algorithm with approximation ratio
26 + ε, for arbitrarily small ε > 0. This algorithm uses as a basic step the
combinatorial algorithm devised in Bar-Noy et al. [BNBYF+00].

Independently from our paper, Bar-Noy et al. [BNBYF+00] proposed a 35 
approximation for our problem that they call Benefit DSA. Their approach is to 
solve a version of the problem where requests are either accepted or rejected in 
an integral sense, while the packing constraint is relaxed to the milder bandwidth 
constraint. A solution of this problem is then combined with an algorithm for 
the DSA problem. In a later version of their paper they improve the result to a
6γ - 1 combinatorial approximation and to a 6γ - 3 LP-based approximation,
where γ is the approximation ratio for DSA. If we consider the 5-approximation
for DSA of [Ger96] this yields respectively a 29 combinatorial and a 27 LP-based 
approximation for the problem. If we consider the 3-approximation claimed in 
[Ger99], this yields a 17 combinatorial and a 15 LP-based approximation for the 
problem. 

We finally show how to extend our algorithm to the multiple channel case 
for bandwidth allocation or, equivalently, to the multiple storage devices case in 
the DSA problem. 

Structure of the paper. In Section 2 we formally describe the RP problem. In 
Section 3 we describe the LP based approximation algorithm for the RP problem. 
In Section 4 we show how the algorithm is turned into a fully polynomial time
algorithm. In Section 5 we present a combinatorial version of the algorithm.
Finally, in Section 6 we describe the extension to multiple channels.

2 The RP Problem

Given an input set of n requests $\{\langle r_i, d_i, b_i, l_i, \omega_i \rangle\}_{i=1}^{n}$,
where $r_i, d_i, b_i, l_i, \omega_i$ are integers, the generic request asks for a
bandwidth interval of size $b_i$ in $[0, B]$ along a continuous time interval of
length $l_i$ contained in $[r_i, d_i]$. A request can be either accepted or rejected.
A request that is accepted is scheduled on a bandwidth interval $[f(i), f(i) + b_i]$
and a time interval $[t(i), t(i) + l_i]$, and a benefit $\omega_i$ is accrued. An
accepted request is represented with a rectangle of basis $l_i$ and height $b_i$ on a
Cartesian space having the time on the abscissa and the bandwidth on the ordinate.
The schedule must observe the constraint that any two rectangles are
non-overlapping. In the following we denote by packing constraint the constraint
that two rectangles are non-overlapping. The packing constraint is stronger than
the bandwidth constraint, which only imposes that the overall bandwidth allocated
at time $t$ cannot exceed $B$. In the following we assume $b_i \le 1$ and $B = 1$.
In the aligned version of RP every bandwidth request is a power of 1/2. The
objective of the algorithm is to maximise the overall profit obtained from
accepted requests.
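
The difference between the packing constraint and the weaker bandwidth constraint can be made concrete with a short sketch. The names below (Placed, satisfies_packing, satisfies_bandwidth) are hypothetical, and the sketch assumes rectangles are half-open in both axes, so that rectangles sharing only a border do not count as overlapping.

```python
from dataclasses import dataclass
from fractions import Fraction
from itertools import combinations


@dataclass(frozen=True)
class Placed:
    """An accepted request (hypothetical record): release r, deadline d,
    bandwidth b, length l, benefit w, base bandwidth f and start time t."""
    r: int
    d: int
    b: Fraction
    l: int
    w: int
    f: Fraction
    t: int


def within_window(p, B=Fraction(1)):
    # The rectangle must fit in [r, d] on the time axis and in [0, B] in bandwidth.
    return p.r <= p.t and p.t + p.l <= p.d and 0 <= p.f and p.f + p.b <= B


def overlap(p, q):
    # Half-open rectangles [t, t+l) x [f, f+b) (an assumption of this sketch).
    return (p.t < q.t + q.l and q.t < p.t + p.l
            and p.f < q.f + q.b and q.f < p.f + p.b)


def satisfies_packing(schedule, B=Fraction(1)):
    return (all(within_window(p, B) for p in schedule)
            and not any(overlap(p, q) for p, q in combinations(schedule, 2)))


def satisfies_bandwidth(schedule, B=Fraction(1)):
    # The load can only increase at a start time, so checking those suffices.
    starts = {p.t for p in schedule}
    return all(sum(p.b for p in schedule if p.t <= t < p.t + p.l) <= B
               for t in starts)
```

For example, two requests of bandwidth 1/2 placed at base bandwidths 0 and 1/4 with overlapping time intervals never use more than bandwidth 1 in total, yet their rectangles intersect, so the schedule meets the bandwidth constraint but violates the packing constraint.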

3 An LP-Based Approximation Algorithm

We present an LP-based approximation algorithm for RP. We first round all
the bandwidth requests up to the nearest power of 1/2. As a basic step of the
algorithm we solve a fractional LP problem in which we only enforce the
bandwidth constraint. We then show that the optimal solution to the fractional RP
problem is a convex combination of a set of integral solutions holding a property
that we will call stability. We select the best among these stable solutions, which
has benefit at least 1/2 of the optimum of the LP problem. The selected solution
can contain intersecting rectangles since the packing constraint has not been
imposed. In the final step of the algorithm we show that the selected solution can
be decomposed into three feasible solutions, of which we select the one with highest
benefit as the final solution to the problem. The obtained solution is
a 6 approximation for the aligned version and a 12 approximation for the original
problem.

Thus the algorithm consists of three main steps: 

1. Solve the LP formulation; 

2. Find a stable solution; 

3. Obtain a feasible solution. 

3.1 The LP Formulation 

In this section we present the LP formulation we use as a basic step for the 
solution of the RP problem. 

We first round every bandwidth request up to the lowest power of 1/2 that is at
least $b_i$, namely $\bar{b}_i = \min\{2^{-k} : 2^{-k} \ge b_i\}$.

Variables $x_{it}$, $t = r_i, \ldots, d_i - l_i$, are associated with request $i$.
Variable $x_{it}$ ranges in $[0, 1]$ and denotes the schedule of request $i$ with
$t(i) = t$. Every request can be fractionally scheduled along a set of intervals
for an overall value not exceeding one, thus we impose
$\sum_{t=r_i}^{d_i - l_i} x_{it} \le 1$. Denote by $T$ the latest deadline of
a request. In the LP formulation we also impose the bandwidth constraint at
any time $t \in 1, \ldots, T$, namely that the overall bandwidth assigned to the
requests fractionally scheduled at time $t$ is at most 1.



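\[
\begin{aligned}
\max \quad & \sum_{i=1}^{n} \sum_{t=r_i}^{d_i - l_i} \omega_i x_{it} \\
\text{s.t.} \quad & \sum_{i=1}^{n} \sum_{t'=t-l_i+1}^{t} \bar{b}_i x_{it'} \le 1, && \forall t \\
& \sum_{t=r_i}^{d_i - l_i} x_{it} \le 1, && \forall i \\
& x_{it} \in [0, 1], && \forall t, i \\
& x_{it} = 0, && \forall i,\ t \notin [r_i, d_i - l_i]
\end{aligned}
\]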
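
The rounding step and the LP constraints can be sketched as follows. This is an illustrative check of a candidate fractional solution, not an LP solver; the function names and the tuple layout (r, d, b, l, w) are hypothetical, and the sketch assumes that a request started at time s occupies the integer times s, ..., s + l - 1.

```python
from fractions import Fraction


def round_up_to_power_of_half(b):
    """Smallest 2**-k with k >= 0 that is at least b, for 0 < b <= 1."""
    if not 0 < b <= 1:
        raise ValueError("bandwidth must lie in (0, 1]")
    p = Fraction(1)
    while p / 2 >= b:
        p /= 2
    return p


def lp_value_if_feasible(requests, x):
    """requests: list of (r, d, b, l, w); x: dict mapping (i, t) to a value.

    Returns the LP objective if x satisfies every constraint, else None.
    """
    rounded = [round_up_to_power_of_half(b) for (_, _, b, _, _) in requests]
    # Range and time-window constraints on each variable.
    for (i, t), v in x.items():
        r, d, _, l, _ = requests[i]
        if not 0 <= v <= 1:
            return None
        if v != 0 and not r <= t <= d - l:
            return None
    # Each request is scheduled for a total value of at most one.
    for i in range(len(requests)):
        if sum(v for (k, _), v in x.items() if k == i) > 1:
            return None
    # Bandwidth constraint at every time step, using rounded bandwidths.
    first = min(r for (r, _, _, _, _) in requests)
    last = max(d for (_, d, _, _, _) in requests)
    for t in range(first, last + 1):
        load = sum(rounded[i] * v for (i, s), v in x.items()
                   if s <= t < s + requests[i][3])
        if load > 1:
            return None
    return sum(requests[i][4] * v for (i, _), v in x.items())
```

Exact rational arithmetic (Fraction) is used so that sums of powers of 1/2 compare exactly against the capacity 1.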



We will denote with $x_{it}$ both a variable and its value. The optimum of the LP
problem is related to the optimum of the RP problem by the following Lemma:



Lemma 1. For the RP problem it holds $OPT(LP) \ge OPT(RP)/2$. For the
aligned version of RP it holds $OPT(LP) \ge OPT(RP)$.



Proof: Consider a new formulation $LP_1$ obtained from LP by replacing in
the bandwidth constraint the $\bar{b}_i$'s with the original $b_i$'s,

\[ \sum_{i=1}^{n} \sum_{t'=t-l_i+1}^{t} b_i x_{it'} \le 1, \quad \forall t, \tag{1} \]

and by imposing the integrality constraints $x_{it} \in \{0, 1\}$. Observe that any
solution to RP is also a solution to $LP_1$, for which $OPT(LP_1) \ge OPT(RP)$.
Since $\bar{b}_i < 2 b_i$, any solution to $LP_1$ with values $x_{it}/2$ is also a
solution to LP, with a benefit at least 1/2 of the benefit of $LP_1$. Therefore,
from a solution to $LP_1$ we obtain a solution to LP of value at least 1/2 of the
value of the solution to $LP_1$.
Then $OPT(LP) \ge OPT(LP_1)/2 \ge OPT(RP)/2$.

For the aligned case, we simply obtain $OPT(LP) \ge OPT(RP)$. ■



3.2 The Algorithm for Obtaining a Stable Solution 

In this section we present the LP based algorithm for the RP problem. We
denote by $i^t$ the request $i$ scheduled at time $t$.

Definition 1. Given a schedule of requests, the support at time $t'$, denoted
by $support(t')$, is the maximum value such that there exists a set of $j$
non-overlapping requests $i_1, i_2, \ldots, i_j$ scheduled at time $t'$ for which
$f(i_1) = 0$, $f(i_k) = f(i_{k-1}) + \bar{b}_{i_{k-1}}$, $k = 2, \ldots, j$, and
$f(i_j) + \bar{b}_{i_j} = support(t')$.

Request $i^t$ is $(h, t')$ stable if
$h = support(t') = \max_{t'' \in [t, t + l_i)} support(t'')$.

A schedule of requests is stable if every request $i$ in the schedule is
$(h_i, t_i)$ stable for some $h_i$ and $t_i$.

The geometrical interpretation of a request $i^t$ that is $(h, t')$ stable is a
rectangle placed on the top of a pile of non-overlapping rectangles of total
bandwidth $h$ (see Figure 1). We will say that the rectangles in the pile form the
support of $i^t$.




Figure 1. Request i is the filled rectangle in the figure. Request i is (h, t) stable.
Observe that 2 requests in a stable schedule can overlap.



Let $x_{it}$ be the value of a variable in the optimal LP solution. Given an
optimal LP solution we denote by $\alpha$ the largest rational such that every
$x_{it}$ is an integer multiple of $\alpha$.

The algorithm for obtaining a good stable solution first finds at most $2/\alpha$
integral stable solutions and then chooses the one with highest benefit. Denote
by $s$ the number of solutions constructed at a generic step of the algorithm. The
algorithm is composed of the following steps:

1. Order the non-zero $x_{it}$ by non increasing $b_i$.

2. Replicate request $i^t$ $x_{it}/\alpha$ times.

3. Assign every replication of $i^t$ to a solution with the following algorithm:

(a) Select those solutions $S_1, \ldots, S_m$ not containing a copy of $i$, out of
the $s$ constructed solutions.

(b) Merge the $m$ solutions $S_1, \ldots, S_m$ of bandwidth 1 into a single
solution $S$ of bandwidth $m$. (The relative order of the solutions is not
relevant for the algorithm.)

(c) Let the replication of $i^t$ be $(h, t')$ stable in $S$.

(d) If $h < m$, then assign the copy of $i^t$ to solution $S_{\lfloor h \rfloor + 1}$
with $f(i^t) = h \bmod 1$; if $h = m$, then construct a new solution having $i^t$
assigned with $f(i^t) = 0$.

4. Select the solution with highest benefit, which we call $S_{best}$.

The simpler alternative would just place every $i^t$ in the first solution where it
fits, i.e. where $i^t$ is $(h, t')$ stable with $h \le 1 - b_i$, if any. However this
alternative fails to place all the replications into at most $2/\alpha$ solutions,
as we are able to show for our algorithm.

Figure 2. The algorithm for obtaining a stable solution.



Lemma 2. Every copy of $i^t$ is placed in a solution $S_j \in S$, $j \le m$, if
$s = 2/\alpha$ solutions have been constructed.

Proof: We prove that $i^t$ is $(h, t')$ stable for a value $h < m$. Assume by
contradiction $h \ge m$. At most $1/\alpha$ distinct copies of $i^t$ need to be
placed for every request $i$. Since $2/\alpha$ solutions are available, at least
$m \ge 1/\alpha + 1$ solutions $S_1, \ldots, S_m$ not containing a copy of $i$ are
available for a single $i^t$. It follows that at time $t'$ the whole bandwidth has
been assigned on all the $m$ solutions, namely for any $S_j$,

\[ \sum_{k^{t''} \in S_j :\ t'' \le t' < t'' + l_k} b_k = 1. \]

From the bandwidth constraint in LP, at time $t'$ we have:

\[ \alpha \sum_{j=1}^{m} \sum_{k^{t''} \in S_j :\ t'' \le t' < t'' + l_k} b_k \le 1. \]

It follows that at time $t'$:

\[ 1 \ge \alpha \sum_{j=1}^{m} \sum_{k^{t''} \in S_j :\ t'' \le t' < t'' + l_k} b_k \ge \alpha m \ge 1 + \alpha, \]

thus a contradiction. ■

We finally show that each integral solution satisfies the bandwidth constraint,
namely for every copy of $i^t$, $f(i^t) + b_i \le 1$. We first give a preliminary Lemma:

Lemma 3. For every request $i$ and every solution $S$, $f(i) = k b_i$ for some
integer $k$.

Proof: The rectangles in the support of $i$ are ordered by non increasing
bandwidth. Since the bandwidths are powers of 1/2, we have that $h_i$ is an
integer multiple of $b_i$. ■

Lemma 4. For any replication of any request $i^t$, $f(i^t) + b_i \le 1$.

Proof: By definition of the algorithm each replication is scheduled at $f(i^t)$ if it
is $(f(i^t), t')$ stable with $f(i^t) < 1$. By the previous Lemma we have that
$1 - f(i^t)$ is an integer multiple of $b_i$, from which the thesis follows. ■



3.3 Obtaining a Feasible Solution 

In this section we show how a stable solution, and in particular $S_{best}$ selected
at the previous step of the algorithm, can be decomposed into three feasible
solutions of the RP problem.

We first construct the intersection graph of Sbest by assigning a vertex to 
every rectangle and connecting with an edge every pair of vertices representing 
intersecting rectangles. We then show that the obtained intersection graph is 
3-colourable and that this can be done in polynomial time. By choosing the set 
of rectangles of same color having maximum benefit we obtain a collection of 
non-overlapping intervals in which every request is scheduled at most once. 

The algorithm is as follows: 

1. Construct the intersection graph of solution $S_{best}$;

2. Colour the intersection graph with three colours with the following
algorithm:

(i.) Consider the requests in order of non increasing bandwidth $b_i$;

(ii.) Colour the requests with same $b_i$ and $f(i)$, in order of starting point,
assigning greedily the 3 available colours.

3. Accept those requests coloured with same colour having total highest benefit;

4. Bring every rectangle's height $\bar{b}_i$ back to the original $b_i$.
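
Steps 1 to 3 above can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: rectangles are hypothetical dictionaries with keys f, b, t, l, w, they are treated as half-open in both axes, and the greedy rule is realised as "take the first of the three colours not used by an already coloured intersecting rectangle". The sketch raises an error when no colour is free, which the properties proved below rule out for stable schedules.

```python
from fractions import Fraction


def intersects(a, b):
    """Half-open rectangles [t, t+l) x [f, f+b) (an assumption of this sketch)."""
    return (a["t"] < b["t"] + b["l"] and b["t"] < a["t"] + a["l"]
            and a["f"] < b["f"] + b["b"] and b["f"] < a["f"] + a["b"])


def colour_and_select(rects):
    """Greedily 3-colour the intersection graph and return the best colour class.

    Returns (colour, best) where colour maps rectangle index to 0, 1 or 2 and
    best is the sorted list of indices in the class of highest total benefit.
    """
    # Non increasing bandwidth, then base bandwidth, then starting point.
    order = sorted(range(len(rects)),
                   key=lambda i: (-rects[i]["b"], rects[i]["f"], rects[i]["t"]))
    colour = {}
    for i in order:
        used = {colour[j] for j in colour if intersects(rects[i], rects[j])}
        free = [c for c in range(3) if c not in used]
        if not free:
            raise ValueError("no colour available; input may not be stable")
        colour[i] = free[0]
    classes = [[i for i in colour if colour[i] == c] for c in range(3)]
    best = max(classes, key=lambda idx: sum(rects[i]["w"] for i in idx))
    return colour, sorted(best)
```

Each colour class is a set of pairwise non-intersecting rectangles, so the returned class is feasible for the packing constraint, and the best of three classes keeps at least one third of the total benefit.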

In order to prove that the algorithm gives a legal 3-coloring of the graph we 
state a set of properties of a stable schedule. We first give a direct corollary of 
Lemma 3. 

Corollary 1. Consider two requests $i$ and $j$ with $b_i \ge b_j$, respectively
$(h_i, t_i)$ and $(h_j, t_j)$ stable. Request $i$ intersects with request $j$ only if
$h_i \le h_j < h_i + b_i$.

The next Lemma is used to prove the two Lemmas that follow it.




Approximation Algorithms for Bandwidth and Storage Allocation Problems 417 



Lemma 5. Consider a schedule with two intersecting requests $i$ and $j$ that
are respectively $(h_i, t_i)$ and $(h_j, t_j)$ stable. Then
$t_i, t_j \notin [t(i), t(i) + l_i) \cap [t(j), t(j) + l_j)$.

Proof: The proof is by contradiction. Assume request $i$ placed before $j$. If
$t_i \in [t(i), t(i) + l_i) \cap [t(j), t(j) + l_j)$ then $i$ is part of the support
of $j$, so $h_j \ge h_i + b_i$, a contradiction since the two requests are
intersecting.

Assume $t_j \in [t(i), t(i) + l_i) \cap [t(j), t(j) + l_j)$. It must be
$h_j \ge h_i$, otherwise by Lemma 3, $h_j + b_j \le h_i$, a contradiction to the
intersection of the two rectangles. Since we are considering the aligned case, at
least one rectangle of the support of $j$ in $t_j$, say $h$, will be scheduled
between $h_i - b_h$ and $h_i$. Therefore, $i$ is part of the support of $j$, a
contradiction. ■

The next Lemma states that if $i$ and $j$ are intersecting the two associated
time intervals are not nested.

Lemma 6. For any two intersecting requests $i$, $j$, it never holds
$[t(i), t(i) + l_i) \subseteq [t(j), t(j) + l_j)$.

Proof: The proof is by contradiction. If $i$ and $j$ are overlapping and
$[t(i), t(i) + l_i) \subseteq [t(j), t(j) + l_j)$ then for the support of $i$ it
holds $t_i \in [t(i), t(i) + l_i) \cap [t(j), t(j) + l_j)$, a contradiction to
Lemma 5. ■

Lemma 7. The maximum clique size of the intersection graph is 2. 

Proof: Assume by contradiction that requests $i$, $j$ and $k$ form a clique of size
3 and that $k$ is assigned to a solution after $i$ and $j$. Assume $i$ is
$(h_i, t_i)$ stable, $j$ is $(h_j, t_j)$ stable and $t_i < t_j$. Request $k$ must be
completely contained in the interval $(t_i, t_j)$, otherwise $k$ is either
$(h_i + b_i, t_i)$ stable or $(h_j + b_j, t_j)$ stable, thus it does not intersect
with $i$ or $j$.

Therefore $[t(k), t(k) + l_k)$ is completely contained in $(t_i, t_j)$, which leads
to the fact that either $t_k \in (t(i), t(i) + l_i)$ or $t_k \in (t(j), t(j) + l_j)$.
By Lemma 5 this is a contradiction to the assumption that $k$ intersects both $i$
and $j$. ■

We finally prove that the algorithm produces a legal 3 colouring of the inter- 
section graph. 

Theorem 1. The algorithm colours the intersection graph with 3 colours. 

Proof: By Corollary 1, requests with same bi and different f{i) are non inter- 
secting. Therefore they can be coloured independently. Concentrate on a set of 
requests with same bi and f{i). They are coloured greedily in order of starting 
point, i.e. from left to right. 

Consider one such request $i$. By Lemma 6, every request intersecting $i$ can
intersect either $t(i)$ or $t(i) + l_i - 1$, but not both endpoints. If more than 1
request intersects an endpoint of $i$, by Corollary 1, these all intersect in that
point, thus creating a clique of size at least 3, a contradiction by Lemma 7.
Therefore at most 1 request intersects each endpoint of $i$, at most 2 requests
intersecting $i$ are already coloured, leaving one colour available for $i$. ■

We finally show the approximation ratio we obtain.

Theorem 2. The algorithm for the RP problem is 12-approximated in the general
case and 6-approximated in the aligned case.

Proof: The algorithm selects a solution whose benefit is at least
$OPT(LP)/2$, as it follows from:

\[ OPT(LP) = \sum_{j=1}^{s} \sum_{i^t \in S_j} \alpha\, \omega_i \le 2 \sum_{i^t \in S_{best}} \omega_i. \]

By Lemma 1, $OPT(LP) \ge OPT(RP)/2$ in the general case, for which the
benefit of $S_{best}$ is at least 1/4 of $OPT(RP)$, while in the aligned case we have
$OPT(LP) \ge OPT(RP)$, for which the benefit of $S_{best}$ is at least 1/2 of
$OPT(RP)$. Moreover we colour the requests of $S_{best}$ with 3
colours and select the set of rectangles with same colour of highest benefit, for
which the final solution has a benefit of at least 1/3 of the benefit of $S_{best}$.
Altogether we obtain an approximation ratio of 12 for the general case and of 6
for the aligned case. ■
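
As a worked summary of how the three factors in the proof combine, write $w(\cdot)$ for the total benefit of a solution and $S_{final}$ for the colour class returned by the algorithm (both symbols are introduced here only for illustration). In the general case:

```latex
\[
w(S_{final}) \;\ge\; \frac{1}{3}\, w(S_{best})
            \;\ge\; \frac{1}{3} \cdot \frac{1}{2}\, OPT(LP)
            \;\ge\; \frac{1}{6} \cdot \frac{1}{2}\, OPT(RP)
            \;=\; \frac{1}{12}\, OPT(RP).
\]
```

In the aligned case the last step uses $OPT(LP) \ge OPT(RP)$ instead, which gives $w(S_{final}) \ge OPT(RP)/6$.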




Figure 3. Requests are coloured by non-increasing bandwidth size. 



4 A Fully Polynomial Algorithm 

The number of constraints in the LP formulation is $O(nT)$, thus leading to a
pseudopolynomial algorithm. Bar-Noy et al. [BNGNS99] showed how to reduce
the number of time slots to a polynomial in n in an LP formulation for the
maximum throughput scheduling problem under real-time constraints. The
application of their technique to our case allows us to express the LP solution as a
convex combination of $3/\alpha$ rather than $2/\alpha$ integral solutions, therefore
leading to an integral solution with a benefit at least 1/3 of the optimum LP. This
results in a higher approximation ratio of 18 for the general version and of 9 for
the aligned version.

5 A Combinatorial Algorithm 

In this section we sketch how to replace the basic step of the approximation
algorithm based on the solution of a fractional LP formulation with a combinatorial
algorithm that delivers a constant approximation solution to the LP problem.
We partition the requests into wide requests, which ask for at least 1/2 of the
available bandwidth, and narrow requests, whose bandwidth requirement is less than
1/2. We solve the RP problem separately for wide requests and narrow requests
and we choose the best solution. If all requests are wide then RP is equivalent
to interval scheduling, for which a 2 approximation algorithm is known [Spi99].

For narrow requests we replace the basic step of the algorithm based on
solving the LP formulation with a combinatorial algorithm. We divide every
request into k identical requests, each one with a fraction 1/k of the bandwidth and
of the profit of the original request. We then apply the combinatorial algorithm
of [BNBYF+00] for finding an approximate integral solution to the problem in
which only the bandwidth constraint is imposed. Lemma 3.2 of [BNBYF+00]
states the following:

Lemma 8. For each integer k there exists a combinatorial algorithm that finds
a (2 + 1/k)-approximate solution to the LP formulation.

Therefore the combinatorial algorithm gives a solution that is away from
the optimal LP solution by at most a factor 2 + 1/k, thus leading to a
12(2 + 1/k)-approximate solution for narrow requests. Combined with the 2
approximation for wide requests we obtain:

Theorem 3. For every k there exists a (26 + 1/k) combinatorial approximation
algorithm for the RP problem.
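
The constant 26 can be traced with a short calculation. The following is a sketch that assumes the usual best-of-two argument; the symbols $A_w$ and $A_n$ (the benefits of the solutions computed for wide and narrow requests), $OPT_w$ and $OPT_n$ (the optimal benefits restricted to each class) and the parameter $k'$ are introduced here only for illustration.

```latex
\[
OPT(RP) \;\le\; OPT_w + OPT_n
        \;\le\; 2\,A_w + 12\Bigl(2 + \frac{1}{k'}\Bigr) A_n
        \;\le\; \Bigl(26 + \frac{12}{k'}\Bigr) \max(A_w, A_n).
\]
```

Running the combinatorial step with parameter $k' = 12k$ turns the bound $26 + 12/k'$ into $26 + 1/k$, which can be made arbitrarily close to 26.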

6 The Multiple Channel Case 

In this section we assume that m channels, each one with a bandwidth $B_j \le 1$,
are available. For the sake of simplicity we assume the $B_j$'s to be powers of 1/2.
We briefly sketch the extension of known techniques [BNGNS99] to obtain a
$c + 1$ throughput maximization approximation algorithm for m parallel unrelated
machines, provided a $c$ approximation algorithm for a single machine. We consider
a Linear Programming formulation with variables $x_{ijt}$ indicating the allocation
of request $i$ at time $t$ on machine $j$. We set $x_{ijt} = 0$ for those machines $j$
where $\bar{b}_i > B_j$. We then solve the LP problem and apply our rounding
algorithm in order from channel 1 to channel m, while we disregard on channel $j$
requests already accepted on a previous channel. We then conclude with the
following theorem:

Theorem 4. Provided a $c$ approximation algorithm for the RP problem on a
single channel, there exists a $c + 1$ approximation algorithm for the RP problem
on multiple channels.

7 Conclusions 

In this paper we present constant approximation algorithms for the RP problem,
a throughput version of bandwidth and storage allocation problems when real
time constraints are imposed. Our algorithm uses as a basis a solution of a Linear
Programming formulation and partitions it into a convex combination of integral
solutions with a novel rounding technique. We improve the approximation results
found independently from our work in [BNBYF+00].

An interesting open problem is to study the problem in the on-line model in
which requests for bandwidth allocation are presented over time. Another interesting
model is one that does not allow rejection of a request if enough bandwidth is
available. We finally mention the improvement of the approximability of the problem,
in particular by exploiting some of the ideas behind the recent approximation
for DSA of [Ger96, Ger99].

Abstract. For the classic dynamic storage/spectrum allocation problem,
we show that knowledge of the durations of the requests is of no
great use to an online algorithm in the worst case. This answers an open 
question posed by Naor, Orda, and Petruschka [9]. More precisely, we 
show that the competitive ratio of every randomized algorithm against 
an oblivious adversary is where x may be any of several dif- 

ferent parameters used in the literature. It is known that First Fit, which 
does not require knowledge of the durations of the task, is logarithmically 
competitive in these parameters. 



1 Introduction 

The dynamic storage/spectrum allocation (DSA) problem is a classic combina- 
torial optimization problem in the computer science literature (for surveys see 
[2] or [13]). 

Dynamic Storage/Spectrum Allocation Problem Statement: An online
algorithm is equipped with a linear resource, for example memory or radio
spectrum, that it must use efficiently to satisfy a sequence of n requests for this
resource. The $i$th request $R_i$ is a pair $(s_i, d_i)$ that is revealed to the online
algorithm at some release time $r_i$. The parameter $s_i$ denotes the bandwidth of
the request $R_i$, and $d_i$ denotes the duration of $R_i$. We number the requests by
increasing order of release times, that is, $r_i < r_j$ for $i < j$. In response to
request $R_i$, the online algorithm must allocate $s_i$ units of contiguous resource
to $R_i$ during the time interval $[r_i, r_i + d_i]$. We say that $R_i$ is active
during $[r_i, r_i + d_i]$.

Importantly, at no time may any two active requests share a common unit of 
resource. The objective function is to minimize the total resource size required 
to satisfy all of the given requests. 

* Supported in part by NSF grant CCR-9734927 and by ASOSR grant F49620010011.

In essentially all programming systems, dynamic memory managers do not,
and more generally cannot, know the exact duration of dynamically allocated
objects. Hence, if the linear resource is memory, the standard assumption is that
the online scheduler does not learn $d_i$ at time $r_i$ [13]. However, in many wireless
data transmission applications, where the linear resource is frequency spectrum,
the online scheduler does learn $d_i$ at time $r_i$. For example in the testing of
aircraft, $d_i$ is known because the test is predetermined [6]. More generally, the
value of $d_i$ will be known if the information to be transmitted is known a priori.
In this paper, we consider the online version of DSA with known durations.

The standard measure used to compare online algorithms is competitiveness
[1]. In the context of DSA, an online randomized algorithm A is c-competitive
if for all input sequences I, the expected value of the bandwidth A(I) used by
A on input I is at most c times Opt(I), the optimal bandwidth to satisfy I [1].

Most of the literature on online dynamic storage allocation deals with the
unknown duration case, that is, when the online algorithm does not learn the
duration of a request $R_i$ until $R_i$ leaves the system at time $r_i + d_i$. There is
no constant competitive algorithm in the unknown duration case [10,11].
Every deterministic algorithm is $\Omega(\log \beta)$ competitive, where $\beta$ is the ratio of the
largest bandwidth of a request to the smallest bandwidth of a request [10]. Every
randomized algorithm is I7(min(j^^j^j^, competitive against an obliv- 

ious adversary [8], here a is the maximum number of active requests at any one 
time. An oblivious adversary must specify all of the requests before the online
algorithm begins execution [1]. First Fit, which places each request in the
lowest feasible location in the linear resource, was observed in [4] to have a
competitive ratio of $\Theta(\log(2 + \alpha\beta/M))$, where M is the maximum
amount of occupied linear resource.
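
As an illustration of the First Fit rule just described, the following Python
sketch places each request at the lowest feasible offset. It is only a sketch
under stated assumptions: the function name first_fit and the tuple layout
(release time, bandwidth, duration) are our own illustrative choices, the
requests are assumed to be listed in release order, and a request occupying
[r, r + d] is assumed to still block its spectrum at time r + d.

```python
from typing import List, Tuple

# Hypothetical layout: (release time r_i, bandwidth s_i, duration d_i).
Request = Tuple[int, int, int]


def first_fit(requests: List[Request]) -> Tuple[List[int], int]:
    """Place each request at the lowest feasible offset.

    Returns the chosen offsets (in input order) and the total amount of
    linear resource touched. Requests must be given in release order.
    """
    active = []   # (end_time, offset, size) of requests still in the system
    offsets = []
    peak = 0
    for r, s, d in requests:
        # A request is active on the closed interval [r_i, r_i + d_i],
        # so only requests that ended strictly before r are dropped.
        active = [(e, o, z) for (e, o, z) in active if e >= r]
        pos = 0
        # Scan occupied blocks from low to high offsets and stop at the
        # first gap that can hold s contiguous units.
        for _, o, z in sorted(active, key=lambda blk: blk[1]):
            if pos + s <= o:
                break
            pos = max(pos, o + z)
        offsets.append(pos)
        active.append((r + d, pos, s))
        peak = max(peak, pos + s)
    return offsets, peak
```

For example, on the requests (0, 2, 10), (1, 3, 2), (2, 1, 10), (5, 3, 1) the
last request reuses the gap left by the second one, which has departed by
time 5.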

In [9], the dynamic storage allocation problem with known durations is
considered. The authors show that, in the case of known durations, an online algorithm
may be logarithmically competitive in other parameters besides $\alpha$ and $\beta$. More
precisely, an online algorithm may be $O(\log \Delta)$ competitive, where $\Delta$ is the ratio
of the largest duration of a request to the smallest duration of a request, and
may be $O(\log \tau)$ competitive, where $\tau$ is the number of distinct durations. In [9]
the fundamental problem of determining whether knowledge of the durations of
the requests is of significant benefit to the online algorithm, i.e. whether it is
possible for an online algorithm to be constant competitive, is left open. Here
we prove the following theorem.

Theorem 1. Assuming an oblivious adversary, the competitive ratio of every
randomized online algorithm for the dynamic storage allocation with known
durations is $\Omega(\log x / \log\log x)$, where x may be any one of n, $\alpha$, $\beta$, $\tau$, or $\log \Delta$.

We feel that the proper interpretation of these results is that knowledge of
the durations does not greatly benefit the online allocator in the worst case.
These results do, however, leave open the possibility that knowing the durations
may logarithmically improve the competitive ratio achievable by an online
algorithm when the competitive ratio is measured in terms of $\Delta$.
The offline version of DSA, where the algorithm has complete knowledge
of future requests, is NP-hard (see [3], which credits this result to a personal 
communication from Stockmeyer), and a 3-approximation polynomial time al- 
gorithm is known [5]. Several average case analyses of various algorithms are 
known, for more information see [13]. Some early papers on DSA are [7,12,14]. 

For convenience, we will refer to the linear resource as spectrum for the rest 
of the paper. 

2 The Lower Bound Construction 

In this section we prove Theorem 1 using the following statement of Yao’s prin- 
ciple for online cost minimization problems [1]. 

Theorem 2. Suppose that there exists an input distribution on the inputs I such
that for all deterministic algorithms A it is the case that

$$\liminf_{n \to \infty} \frac{E[A(I)]}{E[\mathrm{Opt}(I)]} \ge c$$

and

$$\limsup_{n \to \infty} E[\mathrm{Opt}(I)] = \infty,$$

where the expectation is over the marginal distribution on inputs of size n. Then
the competitive ratio of every randomized algorithm is at least c.

In order to apply Theorem 2, fix a deterministic online algorithm A, and fix 
some c > 2. Let W = 12(12c)^’^°. Note that c = ^ 

request distribution that forces $E[A(I)] = \Omega(cW)$ while $E[\mathrm{Opt}(I)] = O(W)$.

Request Distribution: We partition the requests into 24c rounds. The band- 
width of each request in round z, 1 < i < 24c, will be w{i) = (12c)^®“^. Further, 
we partition round z, 1 < z < 24c, into ^ stages, where 6(z) = (12c)^*“^. The 

number $\ell_{i,j}$ of requests in stage j, $1 \le j \le W/b(i)$, of round i will be selected
uniformly at random over the range $[1, 2j\,b(i)/w(i)]$. Note that we have set the value
of W so that $W/b(i) \ge 12$ for any $1 \le i \le 24c$.

We associate an interval $I_{i,j} = [a_{i,j}, b_{i,j}]$ with stage j of round i. Initially,
$I_{1,1} = [a_{1,1}, b_{1,1}] = [0, 1]$. Each request in stage j of round i is released at time
$a_{i,j}$. The duration of the kth request, $1 \le k \le \ell_{i,j}$, in stage j of round i is then
$(b_{i,j} - a_{i,j})(1 - 1/2^k)$. Notice that the durations increase throughout a stage.

We call the last request in each stage a fang. 

Finally, the interval associated with the next stage (whether it is in the same
round or a different round is irrelevant) is
$[a_{i,j} + (b_{i,j} - a_{i,j})(1 - 1/2^{\ell_{i,j}-1}),\ a_{i,j} + (b_{i,j} - a_{i,j})(1 - 1/2^{\ell_{i,j}})]$.
That is, the interval for the next stage is between the
end of the penultimate request in the previous stage and the end of the fang
in the previous stage. Thus all fangs released during a stage will stay present
until the arrival of the last request of the last round, while requests that are not
fangs leave the system before requests in the next stage appear. For a graphical
depiction of the construction see Figure 1.
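
The nesting of stage intervals can be checked on a small instance. The
following Python sketch is illustrative only: the helper names stage_requests
and next_interval are hypothetical, exact rational arithmetic is used to avoid
rounding, and it assumes that every request of a stage is released at the left
endpoint a of the stage interval [a, b] and that the kth request has duration
(b - a)(1 - 1/2^k).

```python
from fractions import Fraction


def stage_requests(a, b, num_requests):
    """Return (release, duration) pairs for one stage with interval [a, b].

    Assumption: all requests of the stage are released at time a, and the
    k-th request lasts (b - a) * (1 - 1/2**k).
    """
    a, b = Fraction(a), Fraction(b)
    return [(a, (b - a) * (1 - Fraction(1, 2 ** k)))
            for k in range(1, num_requests + 1)]


def next_interval(a, b, num_requests):
    """Interval of the following stage.

    It spans from the end of the penultimate request to the end of the
    last request (the fang) of the current stage.
    """
    a, b = Fraction(a), Fraction(b)
    lo = a + (b - a) * (1 - Fraction(1, 2 ** (num_requests - 1)))
    hi = a + (b - a) * (1 - Fraction(1, 2 ** num_requests))
    return lo, hi
```

With the initial interval [0, 1] and three requests, the durations are 1/2,
3/4 and 7/8, and the next stage lives in [3/4, 7/8]: both non-fangs have left
by time 3/4, while the fang persists until 7/8.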



Fig. 1. Lower Bound Construction (axes: Time and Spectrum; labels: stage j interval $[a_{i,j}, b_{i,j}]$, stage j+1 interval, fang)



Intuitively, the adversary may allocate fangs consecutively in the spectrum.
However, since the online algorithm cannot identify a fang when it arrives, it
will not be able to allocate fangs to one part of the spectrum, and thus will have
a more fragmented resource when it proceeds to the next stage. We will first
prove that $\mathrm{Opt} = O(W)$, and then show that the expected bandwidth used by
A is $\Omega(cW)$.

Lemma 1. With probability one, $\mathrm{Opt}(I) \le 4W$.

Proof. One possible strategy to achieve 4W is to divide the spectrum into two
pieces N and F, each of size 2W. All non-fangs are allocated to N using
First Fit, and all fangs are allocated to F using First Fit.
The total bandwidth of the non-fang requests in a stage j in round i is at
most the product of the bandwidth w(i) of the requests in round i, and the
number of requests $2j\,b(i)/w(i)$. Since j is at most $W/b(i)$, the total bandwidth of the
non-fang requests in stage j of round i is at most

$$w(i) \cdot 2 \cdot \frac{W}{b(i)} \cdot \frac{b(i)}{w(i)} = 2W.$$

Hence, all the non-fangs within any stage can be fit into N. Observe that any 
two non-fang requests not from the same stage do not overlap in time. Therefore, 
the same spectrum N can be reused for all of the stages. 

We now calculate the bandwidth required to satisfy the fang requests. For round i,
there are $W/b(i)$ fang requests of width w(i) each. Therefore, the total bandwidth
of the fangs is at most

$$\sum_{i=1}^{24c} \frac{W}{b(i)}\, w(i) = \sum_{i=1}^{24c} \frac{W}{12c} = 2W.$$

Hence, all the fangs can fit into F. ■

In order to analyze the performance of A it is convenient to grant A additional
powers. At the end of each round, say round i, we allow A to reorganize its
spectrum in the following manner. Assume that there are two fangs $R_j$ and $R_k$
such that the space between $R_j$ and $R_k$ is less than w(i + 1), the bandwidth
of the requests in the next round. Notice that the space used by $R_j$ and $R_k$,
and the space between $R_j$ and $R_k$, is of no use to A in future rounds. The
online algorithm A may then delete the space used by $R_j$, the space used by
$R_k$, and the space between these two requests from its spectrum, and meld the
remaining two pieces of spectrum. We say that the remaining contiguous portion
of the melded spectrum is available spectrum for the next round. Note that this
is strictly to A's benefit since any feasible assignment on the original spectrum
is also a feasible assignment in the modified spectrum.

We now wish to define what we call a fragmented round. Intuitively, a
fragmented round is one in which A assigns many fangs to different parts of the
available spectrum. Let i be some round. We partition A's available spectrum
into contiguous i-blocks $B_{i,k}$ of size b(i). We say that a fang from round i is
assigned to an i-block $B_{i,k}$ if either the fang is contained entirely within $B_{i,k}$ or
the fang crosses the boundary between $B_{i,k}$ and $B_{i,k+1}$. Let f(i) be the number
of distinct i-blocks that have at least one fang assigned to them. We say that a
round i is fragmented if $f(i) \ge W/(2b(i))$.

We will now argue that: 

— the expected number of fragmented rounds is at least one half of the total 
number of rounds, and 

— the available spectrum decreases by $\Omega(W)$ after each fragmented round
provided that A does not use too much spectrum.
Lemma 2. The expected number of fragmented rounds is at least 12c.

Proof. Consider stage j of round i. Let $f_j(i)$ be the number of distinct i-blocks
assigned at least one fang from stages 1 through j in round i. By our construction
and the definition of fang requests, we have $f_{j-1}(i) \le j - 1$. Therefore, at most
$(j - 1)\,b(i)/w(i)$ of the requests in stage j of round i can occupy i-blocks that have
already been assigned a fang in round i. Since the number of requests in stage
j of round i is selected uniformly at random from the range $[1, 2j\,b(i)/w(i)]$, the
probability that the fang of stage j of round i is assigned to an i-block that does
not already contain a fang is at least 1/2. Hence, the probability that half of the
fangs in round i are assigned to an i-block that did not previously contain a fang
from round i is at least 1/2. The result then follows since there are 24c rounds. ■

Lemma 3. During every fragmented round, either A deletes at least $W/12$ units
of spectrum at the end of this round, or A uses at least cW units of spectrum
during this round.

Proof. Fix a fragmented round i, and consider the assignment of the fangs from
round i. Within each i-block $B_{i,k}$ that contains a fang from round i, arbitrarily
pick a canonical fang from among those fangs assigned to $B_{i,k}$ during round i.
Recall that there are at least $W/(2b(i))$ canonical fangs since i is a fragmented round.
We sort these canonical fangs according to their position in A's spectrum, and
group them into groups, each containing 3 consecutive canonical fangs.

We say that a group is sparse if the gap between any two consecutive
canonical fangs in the group is at least w(i + 1). Recall that we set W such that
$W/(12b(i)) \ge 1$ for all $1 \le i \le 24c$. First, suppose that $W/(12b(i))$ of the groups are sparse.
Then the total spectrum used by A must be at least $\frac{W}{12b(i)}\, w(i + 1)$, which by
substitution is cW.

On the other hand, suppose that $W/(12b(i))$ of the groups are not sparse. Hence,
the spectrum used by the canonical fangs in each such group, as well as the spectrum
in the gaps between these fangs, will be deleted by A at the end of the round.
Since the gap between the first and third requests of each group is at least b(i), the
total spectrum deleted by A during this round is at least $W/12$. ■

We are now ready to prove Theorem 1.

Proof. First, the fact that $\mathrm{Opt} = O(W)$ follows from Lemma 1. By Lemma 2
and Lemma 3, the expected spectrum used by A is at least cW. Note that the
expected spectrum used by Opt goes to $\infty$ as n goes to $\infty$. Hence, we get a
lower bound of $\Omega(c)$ on the competitive ratio.

Observe that we get a lower bound of $\Omega(\log \beta / \log\log \beta)$ on the competitive ratio
since $\beta < W$.

Observe that n is no smaller than $\alpha$ and $\tau$. So in order to show a lower bound
of $\Omega(\log x / \log\log x)$ on the competitive ratio, for x equal to n, $\alpha$, and $\tau$, it is sufficient
to show that $n = O(W^2)$. Since the number of stages in round i is $W/b(i)$ and the
number of requests in stage j of round i is at most $2j\,b(i)/w(i)$, it follows that the
number of requests in round i is at most

$$\sum_{j=1}^{W/b(i)} 2j\,\frac{b(i)}{w(i)} = \frac{b(i)}{w(i)} \cdot \frac{W}{b(i)} \left(\frac{W}{b(i)} + 1\right) \le \frac{2W^2}{w(i) \cdot b(i)}.$$



Since b{i)w{i) = (12c)^*“^ > 12c > 4, the number of requests per round is going 
down by more than a factor of two each round. Hence, the total number of
requests is at most

$$\sum_{i=1}^{24c} \frac{2W^2}{w(i) \cdot b(i)} \le 2W^2.$$



Finally, that the competitive ratio is $\Omega(\log\log \Delta / \log\log\log \Delta)$ follows from
the fact that $\Delta = 2^n$ by construction.
Abstract. We study models of the untyped lambda calculus in the setting
of game semantics. In particular, we show that, in the category of
games $\mathcal{G}$, introduced by Abramsky, Jagadeesan and Malacaria, all
categorical λ-models can be partitioned into three disjoint classes, and each
model in a class induces the same theory (i.e. the set of equations
between terms). These theories are $\mathcal{H}^*$, the theory which identifies two
terms iff they have the same Böhm tree, and the theory which identifies
all the terms which have the same Lévy-Longo tree.



Introduction 

In this work we explore the methodology for giving denotational semantics based
on games, introduced by Abramsky, Jagadeesan and Malacaria, Hyland and Ong, and
Nickau (see [AJM96, HO00, Nic94]). We use game semantics to build models of
the untyped λ-calculus, focusing on which λ-theories can be modeled. λ-theories
are congruences over λ-terms, which extend pure β-conversion. Their interest
lies in the fact that they correspond to the possible operational (observational)
semantics of the λ-calculus. Brute force, purely syntactical techniques are
usually extremely difficult to use in the study of λ-theories. Therefore, since the
seminal work of Dana Scott on $D_\infty$ in 1969 [Sco72], semantical tools have been
extensively investigated [HR92, HL95, Ber].

This paper is the completion of the work initiated in [DGFH99] and gives a
complete characterization of the theories induced by general game models of the
λ-calculus. In [DGFH99] we considered only models which validate the η-rule. In
order to obtain our new results, new proof techniques have been introduced.

We show that the theory induced by each categorical model of the λ-calculus
in the Cartesian closed category $K_!(\mathcal{G})$ of games and history-free strategies is
either: the theory $\mathcal{H}^*$ (the maximal sensible theory), the theory $\mathcal{B}$ which equates
two terms if and only if they have the same Böhm tree, or the theory $\mathcal{L}$ which
equates two terms if and only if they have the same Lévy-Longo tree.

This result suggests that there exists a strong connection between a strategy 
which interprets a term in the game semantics setting and the tree form of the 
term. The existence of relations between strategies and some syntactical normal 
form of terms (like trees) is described in many works on game semantics. In all
these works, a relation is established to study the fine structure of a particular
game model and prove that it is fully abstract. In our work we give a somewhat
stronger result and prove that such a relation exists for all the game models of
the λ-calculus.

Works related to ours, in the slightly different game semantics paradigm of
Hyland and Ong, are [KNO99, KNO00]. There, two particular game models for
the λ-calculus are built and it is proved, using techniques quite different from
ours, that the two models induce respectively the theories $\mathcal{H}^*$ and $\mathcal{B}$.

The present paper is organized as follows. In Section 1 we recall the basic
definitions of game semantics and introduce the extensions and K\{Q‘^) of the 
classical categories of games Q and K\{Q). In Section 2 we introduce the main 
tool for the study of the fine structure of the models built in the categories of 
Section 1, that is the approximating strategies whose intended meaning is to give 
a finite approximation of the interpretation of a A-term. Section 3 is devoted 
to the study of the models previously introduced and to the proof of the main 
theorem of this work, i.e. the characterization of all the A-theories induced by 
the models of the untyped A-calculus in the category K<{Q). 

We assume the reader is familiar with the basic notions and definitions of
λ-calculus, see e.g. [Bar84]. For lack of space, all the proofs have been omitted.
A complete version of this paper is [DGF].

Acknowledgements. We wish to thank Fabio Alessi, Corrado Böhm, Furio
Honsell and Luke Ong for useful discussions during the period in which this
work was developed. We also thank the anonymous referees, whose comments have
helped to improve this work.

1 Categories of Games 

Throughout this paper, we shall make use of the well-known category Q of games 
and history-free strategies and its Cartesian closed companion K\{Q), and of 
the categories 5® of games and history-sensitive strategies and K<{Q^) that are 
new. 5® is a straightforward extension (super-category) of Q and it has been 
introduced for technical reasons. We briefly remind the basic definitions of game 
semantics [AJM96] and introduce the new categories we shall utilize. 

Definition 1 (Games). A game has two participants: the Player and the
Opponent. A game A is a quadruple $(M_A, \lambda_A, P_A, \approx_A)$ where

— $M_A$ is the set of moves of the game.

— $\lambda_A : M_A \to \{O, P\} \times \{Q, A\}$ is the labeling function: it tells us if a move is
made by the Opponent or by the Player, and if it is a Question or an Answer.
We can decompose $\lambda_A$ into $\lambda_A^{OP} : M_A \to \{O, P\}$ and $\lambda_A^{QA} : M_A \to \{Q, A\}$
and put $\lambda_A = \langle \lambda_A^{OP}, \lambda_A^{QA} \rangle$. We denote by $\overline{\,\cdot\,}$ the function which exchanges
Player and Opponent, i.e. $\overline{O} = P$ and $\overline{P} = O$. We also denote with $\overline{\lambda_A^{OP}}$
the function defined by $\overline{\lambda_A^{OP}}(a) = \overline{\lambda_A^{OP}(a)}$. Finally, we denote with $\overline{\lambda_A}$ the
function $\langle \overline{\lambda_A^{OP}}, \lambda_A^{QA} \rangle$.

— $P_A$ is a non-empty and prefix-closed subset of the set $M_A^{\circledast}$ (which will be
written as $P_A \subseteq^{nepref} M_A^{\circledast}$), where $M_A^{\circledast}$ is the set of all sequences of moves
which satisfy the following conditions:

• $s = at \Rightarrow \lambda_A(a) = OQ$

• $\forall i : 1 \le i < |s| .\ \lambda_A^{OP}(s_{i+1}) = \overline{\lambda_A^{OP}(s_i)}$

• $\forall t \sqsubseteq s .\ |t \upharpoonright M_A^{A}| \le |t \upharpoonright M_A^{Q}|$

where $M_A^{A}$ and $M_A^{Q}$ denote the subsets of game moves labeled respectively as
Answers and as Questions, $s \upharpoonright M$ denotes the sequence of moves of M which appear
in s and $\sqsubseteq$ is the substring relation. $P_A$ is called the set of positions of the
game A. Questions and answers behave in a parenthesis-like fashion, that is
each question waits for a corresponding answer to appear and each answer
in a sequence corresponds to the last pending question in that sequence.

— $\approx_A$ is an equivalence relation on $P_A$ which satisfies the following properties:

• $s \approx_A s' \Rightarrow |s| = |s'|$

• $sa \approx_A s'a' \Rightarrow s \approx_A s'$

• $s \approx_A s' \wedge sa \in P_A \Rightarrow \exists a' .\ sa \approx_A s'a'$

In the above s, s', t and t' range over sequences of moves, while a, a', b and b'
range over moves. The empty sequence is written $\epsilon$.
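
The three conditions on sequences of moves in Definition 1 can be checked
mechanically on finite sequences. The Python sketch below is only an
illustration under our own encoding assumptions: the function name
is_legal_sequence is hypothetical, moves are arbitrary hashable values, and
the labeling function is given as a dictionary from moves to pairs such as
("O", "Q"). It tests membership in the ambient set of well-formed sequences,
not in a particular set of positions, which is a prefix-closed subset of it.

```python
def is_legal_sequence(seq, label):
    """Check the three conditions of Definition 1 on a finite sequence.

    seq   -- list of moves
    label -- dict mapping each move to (who, kind), with who in {"O", "P"}
             and kind in {"Q", "A"} (an assumed encoding of the labeling)
    """
    if not seq:
        return True
    # Condition 1: the first move is an Opponent question.
    if label[seq[0]] != ("O", "Q"):
        return False
    # Condition 2: Opponent and Player alternate.
    for prev, cur in zip(seq, seq[1:]):
        if label[prev][0] == label[cur][0]:
            return False
    # Condition 3: no prefix contains more answers than questions.
    answers = questions = 0
    for move in seq:
        if label[move][1] == "A":
            answers += 1
        else:
            questions += 1
        if answers > questions:
            return False
    return True
```

For instance, with an Opponent question q, a Player answer a, a Player
question q2 and an Opponent answer a2, the sequence q q2 a2 a passes all
three checks, while q a a2 fails the third one.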



Definition 2 (Tensor product). Given games A and B, the tensor product
$A \otimes B$ is the game defined as follows:

— $M_{A \otimes B} = M_A + M_B$

— $\lambda_{A \otimes B} = [\lambda_A, \lambda_B]$

— $P_{A \otimes B} \subseteq M_{A \otimes B}^{\circledast}$ is the set of positions, s, which satisfy the following:

• the projections on each component (written as $s \upharpoonright A$ or $s \upharpoonright B$) are positions
for the games A and B respectively;

• every answer in s must be in the same component game as the
corresponding question.

— $s \approx_{A \otimes B} s' \iff s \upharpoonright A \approx_A s' \upharpoonright A,\ s \upharpoonright B \approx_B s' \upharpoonright B,\ \forall i .\ s_i \in M_A \Leftrightarrow s'_i \in M_A$

Here + denotes disjoint union of sets, that is $A + B = \{inl(a) \mid a \in A\} \cup \{inr(b) \mid b \in B\}$,
and $[-, -]$ is the usual (unique) decomposition of a function defined on
disjoint unions.

It is easy to see that in such a game only the Opponent can switch component. 

Definition 3 (Unit). The unit element for the tensor product is given by the
empty game $I = (\emptyset, \emptyset, \{\epsilon\}, \{(\epsilon, \epsilon)\})$.



Definition 4 (Linear implication). Given games A and B, the compound
game $A \multimap B$ is defined as the tensor product but for the condition
$\lambda_{A \multimap B} = [\overline{\lambda_A}, \lambda_B]$.

It is easy to see that in such a game only the Player can switch component. 
Definition 5 (Exponential). Given a game A, the game $!A$ is defined by:

— $M_{!A} = \omega \times M_A = \sum_{i \in \omega} M_A$

— $\lambda_{!A}((i, a)) = \lambda_A(a)$

— $P_{!A} \subseteq M_{!A}^{\circledast}$ is the set of positions, s, which satisfy the following:

• $\forall i \in \omega .\ s \upharpoonright (i, A) \in P_{(i,A)}$;

• every answer in s is in the same index as the corresponding question.

— $s \approx_{!A} s' \iff$ there exists a permutation of indexes $\alpha \in S(\omega)$ such that:

• $\pi_1^*(s) = \alpha^*(\pi_1^*(s'))$

• $\forall i \in \omega .\ \pi_2^*(s \upharpoonright \alpha(i)) \approx_A \pi_2^*(s' \upharpoonright i)$

where $\pi_1$ and $\pi_2$ are the projections of $\omega \times M_A$, $\pi_1^*$ and $\pi_2^*$ are the (unique)
extensions of $\pi_1$ and $\pi_2$ to sequences of moves and $s \upharpoonright i$ is an abbreviation of
$s \upharpoonright (i, A)$.

Definition 6 (Strategies). A strategy for the Player in a game A is a
non-empty set $\sigma \subseteq P_A^{even}$ of positions of even length such that $\overline{\sigma} = \sigma \cup dom(\sigma)$ is
prefix-closed, where $dom(\sigma) = \{t \in P_A^{odd} \mid \exists! a .\ ta \in \sigma\}$, and $P_A^{odd}$ and $P_A^{even}$
denote the sets of positions of odd and even length respectively. A strategy can
be seen as a set of rules which tells (in some position) the Player which move to
make after the last move by the Opponent.

The equivalence relation on positions can be extended to strategies in the 
following way. 

Definition 7 (Equivalence of strategies). Let $\sigma, \tau$ be strategies, $\sigma \approx \tau$ if
and only if

— $sab \in \sigma,\ s'a'b' \in \tau,\ sa \approx_A s'a' \Rightarrow sab \approx_A s'a'b'$

— $s \in \sigma,\ s' \in \tau,\ sa \approx_A s'a' \Rightarrow ((\exists b .\ sab \in \sigma) \Leftrightarrow (\exists b' .\ s'a'b' \in \tau))$

Such an extension is not in general an equivalence relation since it might lack
reflexivity. If $\sigma$ is a strategy for a game A such that $\sigma \approx \sigma$, we write $\sigma : A$ and
denote with $[\sigma]$ the equivalence class containing $\sigma$.

Definition 8 (History-free strategies). A strategy $\sigma$ for a game A is
history-free if it satisfies the following properties:

— $sab, tac \in \sigma \Rightarrow b = c$

— $sab, t \in \sigma,\ ta \in P_A \Rightarrow tab \in \sigma$
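
On a finite game the two properties of Definition 8 can be tested directly.
The following Python sketch is a hypothetical illustration: the name
is_history_free and the encoding of positions as tuples of moves are our own
assumptions, and both the strategy and the set of positions of the game are
assumed to be given explicitly as finite collections.

```python
def is_history_free(strategy, positions):
    """Test the two properties of Definition 8 on finite sets.

    strategy  -- iterable of even-length positions (sequences of moves)
    positions -- iterable of all positions of the game
    """
    strategy = {tuple(s) for s in strategy}
    positions = {tuple(p) for p in positions}

    # First property: the reply b to an Opponent move a must not depend
    # on the history preceding a.
    reply = {}
    for s in strategy:
        if len(s) >= 2:
            a, b = s[-2], s[-1]
            if reply.setdefault(a, b) != b:
                return False

    # Second property: whenever a is answered by b somewhere, it is
    # answered by b after every t in the strategy where a is playable.
    for t in strategy:
        for a, b in reply.items():
            if t + (a,) in positions and t + (a, b) not in strategy:
                return False
    return True
```

A strategy that answers the same Opponent move q with a in one position
and with b in another fails the first test; a strategy that answers q once
but stays silent when q is played again fails the second.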

Definition 9 (The category of games $\mathcal{G}$). The category $\mathcal{G}$ has as objects
games and as morphisms, between games A and B, the equivalence classes, w.r.t.
the relation $\approx_{A \multimap B}$, of the history-free strategies for the game $A \multimap B$. The
identity, for each game A, is given by the (equivalence class) of the copy-cat
strategy $id_A = \{s \in P_{A' \multimap A''} \mid \forall t \sqsubseteq s .\ even(|t|) \Rightarrow t \upharpoonright A' = t \upharpoonright A''\}$ where
$even(-)$ is the obvious predicate and the superscripts are introduced to distinguish
between the two different occurrences of the game A. Composition is given by the
extension on equivalence classes of the following composition of strategies. Given
strategies $\sigma : A \multimap B$ and $\tau : B \multimap C$, $\tau \circ \sigma : A \multimap C$ is defined by

$\tau \circ \sigma = \{s \upharpoonright (A, C) \mid s \in (M_A + M_B + M_C)^* \wedge s \upharpoonright (A, B) \in \overline{\sigma},\ s \upharpoonright (B, C) \in \overline{\tau}\}^{even}$
It is not difficult to check that the above definitions are well posed and that
the constructions introduced in Definitions 2, 4 and 5 can be made functorial. 
Notice that there is a natural isomorphism in the category of sets between {Ma + 
Mb) + Me and Ma + {Mb + Me) which induces a natural transformation 
A\ B c ■ hom(^ ® B,C) \\ova{A,B C) in Q, that is the category Q is 
monoidal closed. If we define, for each pair of games B and C of Q, the strategy 
evB,c the set {s G P((a'^b')i»A")^b" | Vt C s . even(|t|) ^ t\A' = t\A" 
k t \ B' = t \ B”} we have, for each strategy a : A® B C, the identity 
[a] = [eug q] o {A\ b c([^D ® [*c?b])- However Q is not Cartesian. 

Definition 10 (The Cartesian closed category of games $K_!(\mathcal{G})$). The
category $K_!(\mathcal{G})$ is the category obtained by taking the co-Kleisli category of $\mathcal{G}$
over the co-monad $(!, der, \delta)$ [AJM96], where, for each game A, the
(history-free) strategies $der_A : {!A} \multimap A$ and $\delta_A : {!A} \multimap {!!A}$ are defined as follows:

$der_A = \{s \in P_{!A \multimap A} \mid \forall t \sqsubseteq s .\ even(|t|) \Rightarrow t \upharpoonright (0, A) = t \upharpoonright A\}$

$\delta_A = \{s \in P_{!A \multimap !!A} \mid \forall t \sqsubseteq s .\ even(|t|) \Rightarrow t \upharpoonright (p(i, j), A) = t \upharpoonright (j, (i, A))\}$

where $p : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ is a pairing function. By the above definition the category
$K_!(\mathcal{G})$ has as objects games and as morphisms between games A and B the
equivalence classes of the history-free strategies for the game ${!A} \multimap B$. Moreover,
$K_!(\mathcal{G})$ is Cartesian.

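Definition 11 (Cartesian product). The Cartesian product $A \times B$ of two
games A and B is defined by:

$M_{A \times B} = M_A + M_B$,  $\lambda_{A \times B} = [\lambda_A, \lambda_B]$,

$P_{A \times B} = P_A + P_B$,  $\approx_{A \times B} = \approx_A + \approx_B$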

The projection morphism tTa’^ : A x B ^ A is defined as 

[{s G Pa'xb^A" \ ytCs. even(|t|) ^t\A'= t\A")} o der^xs] 

From the isomorphisms $!(A \times B) \cong {!A} \otimes {!B}$ and $!I \cong I$ it follows easily that
$K_!(\mathcal{G})$ is Cartesian closed [AJM96].

Definition 12 (Exponent). The exponent game $A \Rightarrow B$ is the game ${!A} \multimap B$.
The natural transformation Aa,b,c ■ hom{A x B,C) hom{A,B C) is 
A\a,\b,c> euB.c = &v{b^c ° (deris^C x id\B)- 

In order to carry out the proofs of our main theorem, we need to introduce 
the category having as morphisms all the strategies (not only the history- 
free ones). This because we shall use approximations of history-free strategies 
that are not, in general, history-free. We call these morphisms history-sensitive 
strategies. It is worth noting that almost all the definitions in the categories Q 
and coincide. 

Definition 13 (The category of games Q^). The category has as objects 
games and as morphisms, between games A and B, the equivalence classes, w.r.t. 
the relation :^a^b, of the strategies a : A ^ B. The identity, for each game A, 
is given by the (equivalence class) of the copy-cat strategy id a- Composition is 
given as in Q. 
K\{Q’^) is obtained like K\{Q), since (!,[der], [5]) is a co-monad also over 5®.
Together with the category , we need to introduce a new relation on strategies 
of 0® which induces a partial order on equivalence classes of strategies. This 
notion can be easily proved equivalent to the standard one. 

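Definition 14 (Partial order relation on strategies). Given a game A and
strategies $\sigma : A$ and $\tau : A$ (hence such that $\sigma \approx \sigma$ and $\tau \approx \tau$) we define
$\sigma \sqsubseteq \tau \iff \forall s \in \sigma .\ \exists t \in \tau .\ s \approx_A t$ and then $[\sigma] \sqsubseteq [\tau] \iff \sigma \sqsubseteq \tau$.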

2 Approximating Strategies 

The subject of this section is the general concept of approximating strategy,
which can be seen as a finite approximation of a strategy. It will be used to prove
that the interpretation of a term is the least upper bound of the interpretations
of its "approximate normal forms".

Definition 15. 1. Let D he a game. A sub-game D' of D (written D' AD) 
is a game such that Mu> C Mr), Ad' = Ad \ Mw, Pd' C Pj) and ~d' = 
Pd' X Pd’ ■ 

2. Let D be a game. We indicate with $D^n$ the sub-game of D in which
$P_{D^n} = \{s \in P_D \mid |s| \le n\}$.

3. Let A' be a sub-game of A and let $\sigma$ be a strategy for the game A. We write
$\sigma \upharpoonright A'$ for the strategy $\{s \in \sigma \mid s \in P_{A'}\}$.

4-. Let a : A —o B he a strategy. We indicate with <t” the history-sensitive 
strategy cr|^ ^ P” and with [cr]” the equivalence class [cr"]. 

Observe that if $\sigma \approx \tau$ then $\sigma^n \approx \tau^n$, since equivalent positions have the same
length. Thus we can write $[\sigma]^n$ with no ambiguity. In general the strategy $\sigma^n$ can
be history-sensitive even if the strategy $\sigma$ is history-free. This is because $\sigma^n$ can
reply to a move a of the Opponent in some positions and not reply in some
others. In order to accommodate and freely use the strategies cr" we introduce 
the category tj® of games and history-sensitive strategies. The strategies cr” can 
be seen as a finite approximation of the strategy $\sigma$, and they will be used to prove
an approximation theorem along the same lines as the works [Hyl76, Wad78]. In
these works the approximation of a semantical point is obtained through a series
of projection functions. Here we use a different approach that, in the context of
games, is simpler and more direct. We need to state a series of properties enjoyed
by the approximating strategies. The basic ones are the following.

Proposition 1. For each pair of games A and B and strategy $\sigma : A \multimap B$, the
following properties hold:

1. $\sigma^0 = \{\epsilon\}$

2. $\sigma^n \subseteq \sigma^{n+1}$

3. $\bigcup_{n \in \omega} \sigma^n = \sigma$

4. $(\sigma^n)^m = \sigma^{\min\{m, n\}}$
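
The properties of Proposition 1 are easy to observe on a finite example.
The Python sketch below is only an illustration: it assumes that the nth
approximant of a strategy keeps exactly those of its positions that have
length at most n, represents positions as tuples, and uses the hypothetical
helper name truncate.

```python
def truncate(strategy, n):
    """n-th approximant of a strategy given as a set of tuples.

    Assumption: the approximant keeps the positions of length at most n.
    """
    return {s for s in strategy if len(s) <= n}


# A small strategy that answers q with a, twice in a row.
sigma = {(), ("q", "a"), ("q", "a", "q", "a")}

zeroth = truncate(sigma, 0)                      # only the empty position
nested = truncate(truncate(sigma, 3), 2)         # approximant of an approximant
union = set().union(*(truncate(sigma, n) for n in range(5)))
```

Here zeroth contains only the empty position, nested coincides with
truncate(sigma, 2), and union recovers sigma itself, matching items 1, 4
and 3 respectively.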

Lemma 1. For each pair of games A and B we have: 

1. {A P)”+i < H" ^ P”+i 

2. b|(H" ^ P™) ®A^B = ^ B) ® A'^ ^ B'^ 
3 The Fine Structure of the Game Models

In this section the study of the λ-theory (i.e. the set of equations between
λ-terms) supported by the models built in $K_!(\mathcal{G})$ is carried out. The theory
induced by a model is also known as its fine structure. The equations on terms are
described by means of the equality of some tree of the terms. The trees we
consider are the Lévy-Longo trees [Lev75, Lon83] and the Böhm trees [Bar84, Hyl76].
We briefly recall the definitions.

Definition 16 (Trees). Let = {Xxi . . . a;„.T | n S o;}U{T}U{Aa:i . . .Xn-y \ 
n G uj}, let = {T}U{Axi . . . Xn-y \ ri G w}, let xi, . . . Xn, y he variables and let 
M G A be a term. If M is solvable it is intended to have principal head normal 
form Aa;i . . . Xn-yMi . . . Mm- 

1. The Levy-Longo tree of M , LLT{M) is a S^-labelled infinitary tree defined 
informally as follows: 



LLT{M) = 


T 


if M is unsolvahle 
of order oo 


LLT{M) = 


Xxi . . . Xn.L- 


if M is unsolvahle 
of order n 


LLT{M) = 


Xxi...x„.y 


if M is solvable 




LLT{Mi) 


'^T{Mm) 


The Bohm tree of M , BT(M) is a A7^- 


-labelled tree defined informally 


follows: 






BT{M) = 


T 


if M is unsolvahle 


BT{M) = 


Xxi . ..Xn.y 


if M is solvable 




BT{Mi) ■ ■ ■ BT{Mm) 

On Levy-Longo trees (Bohm trees) there is a natural order relation defined by 
LLT{M) C LLT{N) iff LLT{N) is obtained by LLT{M) by replacing T in 
some leaves of LLT{M) by Levy-Longo trees of A-terms or by replacing some 
Xxi...xn.-L by T (BT{M) C BT{N) iff BT{N) is obtained by BT{M) by 
replacing T in some leaves of BT{M) by Bohm trees of A-terms). 

In this work we are interested in categorical models of the untyped λ-calculus,
that is, reflexive objects in a Cartesian closed category. 

Definition 17 (Categorical λ-model).

1. Let C be a category and $A, B \in Obj(C)$. B is a retract of A if there exists a
pair of morphisms $f : A \to B$ and $g : B \to A$, such that $f \circ g = id_B$. We
write $(B \triangleleft A, f, g)$ to indicate that B is a retract of A via f and g.

2. A reflexive object is a retract $(D \Rightarrow D \triangleleft D, f, g)$, between an object D and
its exponent $D \Rightarrow D$. We write $(D, (f, g))$ to indicate that D is a reflexive
object via morphisms f and g, and call it a categorical λ-model.



Definition 18 (Classes of models). Let T> be the class of all the categorical 
X-models {D,{[ip], bP]))> with D ^ I, in the category K\{Q). We partition V in 
the following subclasses: 

1. = {{D, {[ip], [V>])) G I ■0 o « ido} 

2. = {{D, ([(^], [0])) G I 0 o = ei^D and if o p ^ ido} 

3. = {{D, ([(^], [0])) G X> I 0 o + e/^c} 

The main result of this paper states that, given a categorical A-model D G 

the theory it induces is either 

1. the theory induced by the canonical D^o model of Scott [Sco72, Bar84] 
and [Wad78], if D G 

2. 05, the theory which identifies two terms iff they have the same Bohm tree, 
if D G 

3. L, the theory which identifies two terms iff they have the same Levy-Longo 
tree if D G 

The proof proceeds along the same lines of [Bar84, Wad78, Hyl76]. First we 
show that if two terms are equated in one of the above theories then they are 
equal in the corresponding model. In order to prove this, we state an important 
property satisfied by all the models. The approximation theorem says that the 
interpretation of a term is the least upper bound of the interpretations of its 
approximants. The following definitions and lemmata are necessary to state this 
result. 

Definition 19 (Indexed terms). 

1. The set of λΩ-terms, $\Lambda(\Omega)$ ($\ni M$), is defined from a set of variables Var ($\ni x$)
as $M ::= x \mid MM \mid \lambda x.M \mid \Omega$.

2. The set of (possibly) indexed terms 2l(l7)^(9 M) is the superset of A{{2) 
defined as M ::= x \ MM \ Xx.M \ fi \ M”. 

3. A term is truly indexed if it is of the shape M”. A term is completely 
indexed if all its subterms of the shape variable, abstraction, and application 
are immediate subterms of truly indexed terms. 

Notice that in a truly indexed term the constant 17 does not need to be indexed. 
The reduction rules are extended to indexed terms as follows. 

Definition 20 (Approximate reduction).

1. The following reduction rules are definable on $\Lambda(\Omega)$:

($\Omega_1$) $\lambda x.\Omega \to \Omega$    ($\Omega_2$) $\Omega M \to \Omega$

2. The following reduction rules are definable on indexed terms of
$\Lambda(\Omega)^{\mathbb{N}}$:

($\Omega^n$) $\Omega^n \to \Omega$    ($\Omega^0$) $M^0 \to \Omega$

($\beta_i$) $((\lambda x.P^n)^{m+1} Q^p)^h \to (P[Q^a/x])^b$    ($\beta_{ij}$) $(M^n)^m \to M^{\min\{n, m\}}$

where $b = \min\{n, m+1, h\}$, $a = \min\{m, p\}$.

Lemma 2. A completely indexed term $Q$ is $\Omega^n \Omega^0 \beta_i \beta_{ij}$-normalizing.

Denotational semantics is readily defined. The denotation of a pure λ-term
$M \in \Lambda$ is defined along the usual categorical definition. To accommodate indexed
terms we need to introduce two new rules and use the larger categories of games 
and history-sensitive strategies. 

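Definition 21. Let $D \in \mathcal{D}$ be a categorical λ-model. The
interpretation of a term $M \in \Lambda(\Omega)^{\mathbb{N}}$ (whose free
variables are among the list $\Gamma = \{x_1, \ldots, x_k\}$) in $D$,
$[\![M]\!]^D_\Gamma : D^{|\Gamma|} \to D$, is the strategy inductively defined
as follows:

$[\![x_i]\!]^D_\Gamma = \pi^\Gamma_i$;

$[\![MN]\!]^D_\Gamma = ev \circ \langle \varphi \circ [\![M]\!]^D_\Gamma, [\![N]\!]^D_\Gamma \rangle$;

$[\![\lambda x.M]\!]^D_\Gamma = \psi \circ \Lambda([\![M]\!]^D_{\Gamma, x})$;

$[\![M^n]\!]^D_\Gamma = ([\![M]\!]^D_\Gamma)^n$.

It is immediate to observe that for each term with no indexes $M \in \Lambda$,
the strategy $[\![M]\!]^D_\Gamma$ is history-free.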

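Proposition 2. Let $A$ be a game, $(D, ([\varphi], [\psi]))$ be a reflexive
object in the Cartesian closed category of games $K_!(\mathcal{G})$ and
$\sigma, \tau : A \to D$ be two strategies. Let
$\epsilon_{A \to D} : A \to D = \{\epsilon\}$ be the empty strategy. Then we have

1. $\sigma^0 \cdot \tau = \epsilon_{A \to D}$;

2. $\sigma^{n+1} \cdot \tau \sqsubseteq (\sigma \cdot \tau^n)^{n+1}$.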



Theorem 1 (Validity of indexed reduction). Rules ($\Omega_2$), ($\Omega^n$),
($\Omega^0$), ($\beta_i$) and ($\beta_{ij}$) are valid in each categorical
λ-model $D \in \mathcal{D}$; the rule $\Omega_1$ is valid in each categorical
λ-model $D \in \mathcal{D}^{\mathcal{E}} \cup \mathcal{D}^{\mathcal{B}}$. The
validity of a rule $\gamma$ is intended in the following sense: for each
$P, Q \in \Lambda(\Omega)^{\mathbb{N}}$, if $P \to_\gamma Q$ then
$[\![P]\!]^D_\Gamma \sqsubseteq [\![Q]\!]^D_\Gamma$.

Each λ-term $M$ can be approximated by a “partially evaluated” term
$A \in \Lambda(\Omega)$ which is called an approximant. Different notions of
approximants arise for the different classes of models.

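Definition 22. For each term $M \in \Lambda$ the sets of approximants are defined by:

1. $\mathcal{A}^{\mathcal{E}}(M) = \{A \in \Lambda(\Omega) \mid BT(A[\Delta\Delta/\Omega]) \subseteq BT(M)$ and $A$ is in $\beta\eta\Omega_1\Omega_2$-nf$\}$

2. $\mathcal{A}^{\mathcal{B}}(M) = \{A \in \Lambda(\Omega) \mid BT(A[\Delta\Delta/\Omega]) \subseteq BT(M)$ and $A$ is in $\beta\Omega_1\Omega_2$-nf$\}$

3. $\mathcal{A}^{\mathcal{L}}(M) = \{A \in \Lambda(\Omega) \mid LLT(A[\Delta\Delta/\Omega]) \subseteq LLT(M)$ and $A$ is in $\beta\Omega_2$-nf$\}$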
Lemma 3. For each categorical λ-model $D \in \mathcal{D}$, λ-term $M$ and approximant
A G A^(M) with * G E 

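Definition 23 (Erasing function). The erasing function
$\mathcal{R} : \Lambda(\Omega)^{\mathbb{N}} \to \Lambda(\Omega)$
is inductively defined as follows:

1. $\mathcal{R}(x) = x$; $\mathcal{R}(\Omega) = \Omega$    2. $\mathcal{R}(PQ) = \mathcal{R}(P)\mathcal{R}(Q)$

3. $\mathcal{R}(\lambda x.P) = \lambda x.\mathcal{R}(P)$    4. $\mathcal{R}(M^n) = \mathcal{R}(M)$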
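
The erasing function of Definition 23 can be illustrated by a small sketch. The Python encoding below is only an assumption made for illustration: the class names `Var`, `Omega`, `App`, `Lam`, `Indexed` and the function name `erase` are hypothetical and do not come from the paper. The function simply forgets every index while keeping the rest of the term.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical encoding of (possibly) indexed terms; names are illustrative.
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Omega:
    pass


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"


@dataclass(frozen=True)
class Indexed:
    term: "Term"
    index: int


Term = Union[Var, Omega, App, Lam, Indexed]


def erase(t: Term) -> Term:
    """Drop every index, following the four clauses of Definition 23."""
    if isinstance(t, (Var, Omega)):      # clause 1: variables and Omega are kept
        return t
    if isinstance(t, App):               # clause 2: erase both sides of an application
        return App(erase(t.fun), erase(t.arg))
    if isinstance(t, Lam):               # clause 3: erase under the binder
        return Lam(t.var, erase(t.body))
    if isinstance(t, Indexed):           # clause 4: forget the index itself
        return erase(t.term)
    raise TypeError(f"not a term: {t!r}")


# ((lambda x. x^2)^3 Omega)^1 erases to (lambda x. x) Omega
example = Indexed(App(Indexed(Lam("x", Indexed(Var("x"), 2)), 3), Omega()), 1)
erased = erase(example)
```

The result contains no `Indexed` node, so it is a term of the unindexed syntax, which is what the lemmas below rely on when they apply the erasing function before taking approximants.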



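Lemma 4. For each categorical λ-model $D \in \mathcal{D}^*$, and for each
completely indexed term $M \in \Lambda(\Omega)^{\mathbb{N}}$ there exists a term
$N \in \Lambda(\Omega)^{\mathbb{N}}$ such that
$[\![M]\!]^D_\Gamma \subseteq [\![N]\!]^D_\Gamma$ and
$\mathcal{R}(N) \in \mathcal{A}^*(\mathcal{R}(M))$ with
$* \in \{\mathcal{E}, \mathcal{B}, \mathcal{L}\}$.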

Lemma 5. For each categorical X-model D G T>, X-term M and n G ui there 
exists a completely indexed term M* such that: |M”]^ = 

At last we are ready to state the following. 

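Theorem 2 (Approximation theorem). For each categorical λ-model
$D \in \mathcal{D}^*$ and each λ-term $M$,
$[\![M]\!]^D_\Gamma = \bigsqcup \{[\![A]\!]^D_\Gamma \mid A \in \mathcal{A}^*(M)\}$
with $* \in \{\mathcal{E}, \mathcal{B}, \mathcal{L}\}$.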

From Theorem 2 we can readily conclude that if two terms have the same tree 
they also have the same interpretation in the different game models, that is: 

Proposition 3. For each categorical λ-model $D \in \mathcal{D}$, λ-terms $M$, $N$ we have:

1. if $D \in \mathcal{D}^{\mathcal{L}}$ and $LLT(M) = LLT(N)$ then $[\![M]\!]^D = [\![N]\!]^D$;

2. if $D \in \mathcal{D}^{\mathcal{B}}$ and $BT(M) = BT(N)$ then $[\![M]\!]^D = [\![N]\!]^D$;

3. if $D \in \mathcal{D}^{\mathcal{E}}$ then $M =_{\mathcal{H}^*} N \Rightarrow [\![M]\!]^D = [\![N]\!]^D$.

In the following part of the section we shall prove that if two terms have
different Lévy-Longo trees or different Böhm trees they also have different
interpretations in the corresponding game models of the λ-calculus. This will
characterize completely the theories induced by game models and will
substantiate the intuitive impression that the strategy which interprets a term
is strongly connected with the tree of the term. The following definition and
Lemma 6 are standard (see for instance [Bar84]).

Definition 24 (Similar terms). Given two terms $M, N \in \Lambda$, we say that
$M$ and $N$ are similar and we write $M \sim N$ if both $M$ and $N$ are
unsolvable or they are solvable with principal head normal forms respectively
$\lambda x_1 \ldots x_n.y M_1 \ldots M_m$ and
$\lambda x_1 \ldots x_{n'}.y' N_1 \ldots N_{m'}$ in which $y = y'$ and
$m - n = m' - n'$.

Lemma 6. For each compositional non-trivial model of the λ-calculus $D$, for
each pair of λ-terms $M, N$, if $M \not\sim N$ then $[\![M]\!]^D \neq [\![N]\!]^D$.

The following properties do not necessarily hold for any compositional λ-model.
Lemma 7. For each categorical λ-model $D \in \mathcal{D}$, for each sequence of
terms $M, N, M_1, \ldots, M_m, N_1, \ldots, N_m$, for each sequence of variables
$x, y, x_1, \ldots, x_n$, the following properties hold:

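1. if $[\![x M_1 \ldots M_m]\!]^D \approx [\![x N_1 \ldots N_m]\!]^D$ then $\forall\, 1 \le i \le m.\ [\![M_i]\!]^D \approx [\![N_i]\!]^D$;

2. if $D \in \mathcal{D}^{\mathcal{B}} \cup \mathcal{D}^{\mathcal{L}}$ then $[\![x]\!]^D \not\approx [\![\lambda y.M]\!]^D$;

3. if $D \in \mathcal{D}^{\mathcal{B}} \cup \mathcal{D}^{\mathcal{L}}$ and $n' < n$ then

$[\![\lambda x_1 \ldots x_n.y M_1 \ldots M_m]\!]^D \not\approx [\![\lambda x_1 \ldots x_{n'}.y N_1 \ldots N_{m'}]\!]^D$;

4. if $D \in \mathcal{D}^{\mathcal{L}}$ and $M$ and $N$ are both unsolvable but of different order then

$[\![M]\!]^D \not\approx [\![N]\!]^D$.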



Theorem 3. Let $D \in \mathcal{D}$ be a categorical λ-model and let $M$ and $N$
be two untyped λ-terms. If $[\![M]\!]^D = [\![N]\!]^D$ then we have:

1. $LLT(M) = LLT(N)$ if $D \in \mathcal{D}^{\mathcal{L}}$;

2. $BT(M) = BT(N)$ if $D \in \mathcal{D}^{\mathcal{B}}$.

4 Conclusions 

In the present paper we have studied the λ-theories induced by the game models
without performing the extensional collapse. Through the extensional collapse
it is possible to identify strategies that have the same observational behavior.
In general, the extensional collapse is fundamental in order to obtain fully
abstract game models of programming languages. Therefore it is still possible to
use game models to capture λ-theories that are strictly coarser than the three
considered in this paper. An example of such a theory can be found in [AM95]
where, through the extensional collapse of a model $D$ in
$\mathcal{D}^{\mathcal{L}}$, a fully abstract model of the lazy λ-calculus is
obtained. However, in general, models obtained through the extensional collapse
are more difficult to study; e.g., the equivalence between strategies is
undecidable even in the finite case. Our main theorem defines precisely those
theories that can be obtained using simple (not collapsed) game models, and
hence it also implies that the theories obtained through the extensional
collapse lie only in between the theories $\mathcal{L}$ and $\mathcal{H}^*$.

A second consideration concerns the class of the game models we consider
in this work. We have focused on games and history-free strategies mainly for
historical reasons. We claim that the paper can be easily reformulated in order to
prove the same results for the category of games and innocent strategies [HO00].
We can substantiate our claim by observing that the main tools used in the
proofs (history-sensitive strategies, approximating strategies, Lemma 7) are
not peculiar to the history-free strategies and can be reformulated and applied
in the context of innocent strategies.

A final point concerns the construction of game models. In this paper we do
not build any example of game model for the λ-calculus; however in [DGFH99] a
general method to obtain non-initial solutions of recursive equations is presented.
It is then quite simple to find extensional game models: several examples are
presented there. Non-extensional game models can be obtained through the
standard tricks used in the setting of the cpo models. For example a
non-extensional model whose theory is $\mathcal{B}$ can be obtained by taking
the initial solution of the recursive equation $D = (D \Rightarrow D) \times A$
while a model whose theory is $\mathcal{L}$ can be obtained by taking the
initial solution of the equation $D = (D \Rightarrow D)_\bot \times A$,
where, in both equations, $A$ is an arbitrary game.
Abstract. Typed symmetric λ-calculus is a simple computational interpretation
of classical logic with an involutive negation. Its main distinguishing feature
is to be a true non-confluent computational interpretation of classical logic.
Its non-confluence reflects the computational freedom of classical logic (as
compared to intuitionistic logic).
Barbanera and Berardi proved in [1,2] that first order typed symmetric
λ-calculus enjoys the strong normalization property and showed in [3]
that it can be used to derive symmetric programs.

In this paper we prove strong normalization for second order typed symmetric
λ-calculus.



1 Introduction 

The quest for computational interpretations of classical logic started 10 years
ago with the work of Felleisen [4,5] and Griffin [8]. It has been shown that
classical natural deduction allows one to model imperative features added to
functional languages like Scheme, Common Lisp or ML. Two particular systems,
$\lambda_c$-calculus [4,5] and λμ-calculus [12], have been intensively studied
and the relation between features of languages, rules of natural deduction,
machines and semantics seems to be well understood [9,10,15,16].

In the context of sequent calculus, several other computational interpreta- 
tions of classical logic have been constructed following the spirit of Girard’s linear 
logic [6]. It is often claimed in this context that computational interpretations
of negation in classical logic should be involutive, that is, $\neg\neg A = A$ should be
realized at the computational level. It is even sometimes claimed that this is the
distinguishing feature of classical logic. But the real computational effect of the 
involutive character is not clear. 

Systems coming from a natural deduction setting, like $\lambda_c$-calculus or
λμ-calculus, don’t have an involutive negation.

The symmetric λ-calculus of Barbanera and Berardi [1,2] is a simple
computational interpretation of classical logic which is explicitly based on an
involutive negation. Contrary to λμ-calculus, symmetric λ-calculus is
non-confluent. But this non-confluence is an essential non-confluence which is
supposed to reflect the computational freedom given by classical logic
(compared to intuitionistic logic). In [3] it is shown that it can be used to
derive symmetric programs, which cannot be derived in the usual confluent systems.

In [1,2] Barbanera and Berardi proved that first order typed symmetric
λ-calculus satisfies the strong normalization property. The proof is based on an
original construction of reducibility candidates using fixed points.

In this paper we push the understanding of symmetric λ-calculus one step
further: we prove that second order typed symmetric λ-calculus also satisfies
the strong normalization property. Note that the second order setting gives a
complete kernel of a typed programming language where data types can be
defined internally. Moreover, from our strong normalization result, it can be
easily deduced that one can also extract correct programs from proofs in this
setting.

The proof mixes ingredients from the proof of Barbanera and Berardi and
from our proof of strong normalization of second order typed λμ-calculus [13].

Section 2 is devoted to the definition of symmetric λ-calculus and typed
symmetric λ-calculus of Barbanera and Berardi [1,2]. We show in section 2.3
how to extend this calculus with second order types.

Section 3 is devoted to the proof of strong normalization. Because we have
a second order type system we need a notion of reducibility candidate defined 
independently of the notion of type. Because we have an involutive negation 
we also need to define reducibility candidates using fixed points. Our notion of 
reducibility candidates is defined in section 3.1 and its fundamental properties 
are proved in section 3.2: it is proved in particular that reducibility candidates 
are sets of strongly normalizable terms. In section 3.3, we define an interpretation 
of types by reducibility candidates. We finish the proof by showing in section 3.4 
that, if a term has a certain type, then it belongs to the interpretation of that 
type and thus is strongly normalizable. 

In the sequel types are designated by letters A,B,C etc., while atomic types 
are designated by P,Q, R, etc. 

2 The Symmetric λ-Calculus of Barbanera and Berardi

The symmetric λ-calculus, introduced by Barbanera and Berardi [1,2], originated
from a computational interpretation of classical logic with an involutive negation.
It is basically a λ-calculus with a symmetric application.

In the following, the symmetric application is denoted by $*$, as in [1,2], and the
abstraction by $\mu$ instead of $\lambda$, because it corresponds to negation and not to
implication (see [14] for a discussion of this point).



2.1 Pure Symmetric λ-Calculus

Terms of Symmetric λ-Calculus

Let $Var$ be an infinite set of variables (denoted $x, x_1, x_2, \ldots$). Terms of
symmetric λ-calculus are defined by:

$t := x \mid \mu x.t \mid t * t \mid \langle t, t \rangle \mid \sigma_1(t) \mid \sigma_2(t)$

where $x$ ranges over variables.

Terms are denoted by letters $t, u, v, w$. The set of terms is denoted by $\mathcal{T}$.

Reduction Rules of Symmetric λ-Calculus

($\beta$) $\mu x.u * v \rhd^0 u[v/x]$

($\beta^\bot$) $u * \mu x.v \rhd^0 v[u/x]$

($\pi$) $\langle u_1, u_2 \rangle * \sigma_i(v_i) \rhd^0 u_i * v_i$ for $i \in \{1, 2\}$

($\pi^\bot$) $\sigma_i(v_i) * \langle u_1, u_2 \rangle \rhd^0 v_i * u_i$ for $i \in \{1, 2\}$

The one-step reduction relation between terms $u$ and $v$ is defined from the
previous rules as follows: $u \rhd_1 v$ iff $v$ is obtained from $u$ by replacing
a subterm $u_1$ by $v_1$ with $u_1 \rhd^0 v_1$.

The reduction relation $\rhd$ is defined as the reflexive and transitive closure
of the one-step reduction relation $\rhd_1$.

A term $u$ is strongly normalizable if there is no infinite reduction sequence,
i.e. no infinite sequence $(u_i)_{i < \omega}$ such that $u_0 = u$ and
$u_i \rhd_1 u_{i+1}$, for all $i < \omega$. The set of strongly normalizable
terms is denoted by $\mathcal{N}$.

Comment. Due to the symmetric character of the rules, symmetric λ-calculus is
obviously not confluent.
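
The non-confluence can be checked mechanically on a small instance. The sketch below is an illustration only, not part of the paper: terms are encoded as hypothetical tagged tuples (`"var"`, `"mu"`, `"app"`, `"pair"`, `"inj"`), and substitution is deliberately naive, assuming that bound names never clash with free variables of the substituted term. The term $\mu x.(a * b) * \mu y.(c * d)$ is both a $\beta$-redex and a $\beta^\bot$-redex, and its two reducts are distinct normal forms.

```python
def subst(t, x, v):
    """t[v/x], assuming bound names in t do not clash with free variables of v."""
    tag = t[0]
    if tag == "var":
        return v if t[1] == x else t
    if tag == "mu":
        # stop at a binder that shadows x
        return t if t[1] == x else ("mu", t[1], subst(t[2], x, v))
    if tag == "inj":
        return ("inj", t[1], subst(t[2], x, v))
    # "app" and "pair" both carry two subterms
    return (tag, subst(t[1], x, v), subst(t[2], x, v))


def root_steps(t):
    """All results of applying one of the four rules at the root of t."""
    out = []
    if t[0] != "app":
        return out
    left, right = t[1], t[2]
    if left[0] == "mu":                              # (beta)
        out.append(subst(left[2], left[1], right))
    if right[0] == "mu":                             # (beta-perp)
        out.append(subst(right[2], right[1], left))
    if left[0] == "pair" and right[0] == "inj":      # (pi)
        out.append(("app", left[right[1]], right[2]))
    if left[0] == "inj" and right[0] == "pair":      # (pi-perp)
        out.append(("app", left[2], right[left[1]]))
    return out


def one_step(t):
    """All u such that t reduces to u by contracting one redex anywhere in t."""
    out = list(root_steps(t))
    tag = t[0]
    if tag in ("mu", "inj"):
        out += [(tag, t[1], s) for s in one_step(t[2])]
    elif tag in ("app", "pair"):
        out += [(tag, s, t[2]) for s in one_step(t[1])]
        out += [(tag, t[1], s) for s in one_step(t[2])]
    return out


a, b, c, d = (("var", n) for n in "abcd")
t = ("app", ("mu", "x", ("app", a, b)), ("mu", "y", ("app", c, d)))
reducts = one_step(t)   # one reduct per rule: a * b and c * d
```

Neither `a * b` nor `c * d` contains a redex, so the two reduction paths can never be joined again.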

2.2 Typed Symmetric λ-Calculus

Types

The m-types of the system are defined by:

$A := P \mid \neg P \mid A \wedge A \mid A \vee A$

where $P$ ranges over atomic types.

The types of the system are either m-types or the special type $\bot$.

An involutive negation on m-types is defined as follows:

$\neg(P) = \neg P$

$\neg(\neg P) = P$

$\neg(A \wedge B) = \neg(A) \vee \neg(B)$

$\neg(A \vee B) = \neg(A) \wedge \neg(B)$

In the following we freely use $\neg A$ instead of $\neg(A)$.

Comment. The fact that the type $\bot$ is not among the set of atomic types is
necessary to have a strong normalization result. It is shown in [14] that if
$\neg$ is involutive and $\bot$ is among the set of atomic types, then
normalization fails.
Typing Rules of Symmetric λ-Calculus

$$\frac{}{\Gamma, x : A \vdash x : A}\ \text{axiom}$$

$$\frac{\Gamma \vdash u : A \qquad \Gamma \vdash v : B}{\Gamma \vdash \langle u, v \rangle : A \wedge B}$$

$$\frac{\Gamma, x : A \vdash u : \bot}{\Gamma \vdash \mu x.u : \neg A}\ \neg\text{-intro}$$

$$\frac{\Gamma \vdash u_i : A_i}{\Gamma \vdash \sigma_i(u_i) : A_1 \vee A_2}\ \vee\text{-intro}\ (i = 1, 2)$$

$$\frac{\Gamma \vdash u : \neg A \qquad \Gamma \vdash v : A}{\Gamma \vdash u * v : \bot}$$

In the previous rules, $\Gamma$ denotes an arbitrary context of the form
$x_1 : A_1, \ldots, x_n : A_n$. As usual in typed λ-calculi, we adopt in these
rules an implicit management of contraction and weakening: weakening is
obtained by allowing an arbitrary context in axioms and contraction by merging
contexts in rules with two premises.

Barbanera and Berardi proved in [1,2] that this system satisfies the strong
normalization property, i.e.:

if $\Gamma \vdash u : A$ is derivable, then $u$ is strongly normalizable.



2.3 Second Order Typed Symmetric λ-Calculus

We extend the previous typed symmetric λ-calculus to second order.

Terms

We first extend the definition of terms with two constructions which reflect the
presence of the two quantifiers $\forall$ and $\exists$.

Terms are defined by:

$t := x \mid \mu x.t \mid t * t \mid \langle t, t \rangle \mid \sigma_1(t) \mid \sigma_2(t) \mid a.t \mid e.t$

where $x$ ranges over variables.

Types

We start from an infinite set of type variables (denoted $X, Y, \ldots$).
The m-types of the system are defined by:

$A := X \mid \neg X \mid A \wedge A \mid A \vee A \mid \forall X A \mid \exists X A$

where $X$ ranges over type variables.

The types of the system are either m-types or the special type $\bot$.
An involutive negation on m-types is defined as follows:

$\neg(X) = \neg X$

$\neg(\neg X) = X$

$\neg(A \wedge B) = \neg(A) \vee \neg(B)$

$\neg(A \vee B) = \neg(A) \wedge \neg(B)$

$\neg(\forall X A) = \exists X \neg(A)$

$\neg(\exists X A) = \forall X \neg(A)$
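
That this negation is involutive, in the sense that negating twice returns the same m-type syntactically, can be seen on a small executable sketch. The tuple encoding and the tags (`"var"`, `"nvar"`, `"and"`, `"or"`, `"all"`, `"ex"`) are assumptions chosen for illustration and are not notation from the paper.

```python
def neg(a):
    """Negation on m-types, pushed down to (negated) type variables."""
    tag = a[0]
    if tag == "var":                 # neg(X) = not-X
        return ("nvar", a[1])
    if tag == "nvar":                # neg(not-X) = X
        return ("var", a[1])
    if tag == "and":                 # De Morgan: conjunction becomes disjunction
        return ("or", neg(a[1]), neg(a[2]))
    if tag == "or":                  # De Morgan: disjunction becomes conjunction
        return ("and", neg(a[1]), neg(a[2]))
    if tag == "all":                 # the universal quantifier becomes existential
        return ("ex", a[1], neg(a[2]))
    if tag == "ex":                  # the existential quantifier becomes universal
        return ("all", a[1], neg(a[2]))
    raise ValueError(f"not an m-type: {a!r}")


# forall X. (X or (not-Y and X))
A = ("all", "X", ("or", ("var", "X"), ("and", ("nvar", "Y"), ("var", "X"))))
negated = neg(A)
round_trip = neg(negated)
```

Because each clause swaps a constructor with its dual and recurses, applying `neg` twice rebuilds the original tree node by node; no negation symbol ever sits above a connective or quantifier.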

Reduction Rules

We add two symmetric reduction rules for eliminating $a$ and $e$.

($q$) $a.u * e.v \rhd^0 u * v$

($q^\bot$) $e.v * a.u \rhd^0 v * u$

The notions of reduction and strongly normalizable term are extended in the
obvious way to these rules.



Typing Rules

We add two introduction rules for the quantifiers $\forall$ and $\exists$.

$$\frac{\Gamma \vdash u : A[Y/X]}{\Gamma \vdash a.u : \forall X A}\ \forall\text{-intro}\ (*) \qquad \frac{\Gamma \vdash u : A[B/X]}{\Gamma \vdash e.u : \exists X A}\ \exists\text{-intro}$$

(*) $Y$ is not free in $\Gamma$, $\forall X A$

Comment. The constructions $a$ and $e$ are trivial witnesses of the quantifiers at
the level of terms. They have no real computational effect. Contrary to the case
of second order typed λ-calculus or λμ-calculus, such witnesses are needed for
second order typed symmetric λ-calculus.

If one takes instead the rules:

$$\frac{\Gamma \vdash u : A[Y/X]}{\Gamma \vdash u : \forall X A} \qquad \frac{\Gamma \vdash u : A[B/X]}{\Gamma \vdash u : \exists X A}$$

then reduction doesn’t preserve typing. The crucial situation is the following:

$$\frac{\dfrac{\dfrac{\Gamma, x : \neg A[Y/X] \vdash t : \bot}{\Gamma \vdash \mu x.t : A[Y/X]}}{\Gamma \vdash \mu x.t : \forall X A} \qquad \dfrac{\Gamma, y : \forall X A \vdash s : \bot}{\Gamma \vdash \mu y.s : \exists X \neg A}}{\Gamma \vdash \mu x.t * \mu y.s : \bot}$$



2.4 Extensions

In § 3 we prove strong normalization of the second order typed symmetric
λ-calculus presented in § 2.3. The result easily extends in two directions: one
can add simplification rules and other basic connectives.

Simplification Rules

The symmetric λ-calculus of Barbanera and Berardi has, in addition to the
reduction rules presented in § 2.1, other reduction rules, which we call
simplification rules:

$\mu x.(u * x) \rhd^0 u$

$\mu x.(x * u) \rhd^0 u$

$E[u * v] * w \rhd^0 u * v$

$w * E[u * v] \rhd^0 u * v$

These rules are subject to the following restrictions: in the first two rules $x$
has no free occurrence in $u$; in the last two rules, $E[\ ]$ is a context which
doesn’t bind any free variable of $u * v$.

Strong normalization for the reduction with simplification rules follows from
strong normalization for the reduction without simplification rules. It is
sufficient to remark that:

1) there is no infinite sequence of reductions using only simplification rules,
because each application of a simplification rule strictly decreases the length
of the term;

2) in reduction sequences, one can always push applications of the original rules
before applications of simplification rules.

Additional Connectives

In § 3 we prove strong normalization of typed symmetric λ-calculus based on the
connectives $\wedge$ and $\vee$. The proof extends in a straightforward manner
to other pairs of dual connectives. One interesting case is the calculus based
on $\to$ and its dual $-$, which is easier to relate to typed λ-calculus and
λμ-calculus than the original one.

The typing rules for $\to$ and $-$ are the following:

$$\frac{\Gamma, x : A \vdash u : B}{\Gamma \vdash \lambda x.u : A \to B} \qquad \frac{\Gamma \vdash u : A \qquad \Gamma \vdash v : \neg B}{\Gamma \vdash \langle u, v \rangle : A - B}$$

The corresponding reduction rules are:

$\langle u, v \rangle * \lambda x.t \rhd^0 t[u/x] * v$

$\lambda x.t * \langle u, v \rangle \rhd^0 t[u/x] * v$

3 Proof of Strong Normalization 

We prove the strong normalization using the reducibility method: each type 
is interpreted by a set of terms. In section 3.1 we define the set of possible 
interpretations of types, called reducibility candidates. In section 3.2 we prove 
that each reducibility candidate is a set of strongly normalisable terms. In section 
3.3 we define the notion of interpretation such that each type is interpreted by a
reducibility candidate. In section 3.4 we prove that each term of type A belongs 
to the interpretation of A and therefore is strongly normalisable. 
3.1 Reducibility Candidates

For $C, D \in \mathcal{P}(\mathcal{T})$ and $S \subseteq \mathcal{P}(\mathcal{T})$, one defines the following constructions:

$C \times D = \{\langle u, v \rangle;\ u \in C, v \in D\}$

$C + D = \{\sigma_1(u);\ u \in C\} \cup \{\sigma_2(u);\ u \in D\}$

$\neg(C) = \{\mu x.u;\ \text{for all } v \in C,\ u[v/x] \in \mathcal{N}\}$

$\bigcap S = \{a.t;\ \text{for all } C \in S,\ t \in C\}$

$\bigcup S = \{e.t;\ \text{there exists } C \in S,\ t \in C\}$

If $F : \mathcal{P}(\mathcal{T}) \to \mathcal{P}(\mathcal{T})$ is an increasing
function with respect to set-theoretic inclusion, then $F$ has a smallest fixed
point denoted by $\mu X.F(X)$.

For $C, D \in \mathcal{P}(\mathcal{T})$, one defines

$Neg_D(C) = Var \cup D \cup \neg(C)$

For each $D \in \mathcal{P}(\mathcal{T})$, $Neg_D$ is a decreasing function from
$\mathcal{P}(\mathcal{T})$ to $\mathcal{P}(\mathcal{T})$. Thus, for each
$D' \in \mathcal{P}(\mathcal{T})$, $Neg_D \circ Neg_{D'}$ is an increasing
function which has a fixed point, $\mu X.Neg_D(Neg_{D'}(X))$.
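
The pattern used here, composing two decreasing maps to get an increasing one and then taking its smallest fixed point, can be tried out on a finite toy model. The sketch below is only an analogy under stated assumptions: the universe is a six-element set rather than the infinite set of terms, and the hypothetical map `comp(d)` (set complement joined with `d`) merely stands in for the negation operator as another decreasing function. On the real power set of terms the fixed point exists by monotonicity, but iteration from the empty set need not stop after finitely many steps.

```python
def least_fixed_point(f):
    """Iterate f from the empty set until it stabilises.

    Assumes f is increasing on the subsets of a finite universe, so the
    chain {} <= f({}) <= f(f({})) <= ... reaches the least fixed point.
    """
    x = frozenset()
    while True:
        y = frozenset(f(x))
        if y == x:
            return x
        x = y


UNIVERSE = frozenset(range(6))


def comp(d):
    """Toy decreasing map X -> d | (UNIVERSE - X); a stand-in, not the paper's Neg."""
    return lambda x: frozenset(d) | (UNIVERSE - x)


d, d_prime = {0, 1}, {1, 2}

# comp(d) after comp(d_prime) is increasing, so it has a least fixed point c.
c = least_fixed_point(lambda x: comp(d)(comp(d_prime)(x)))
c_prime = comp(d_prime)(c)
```

The pair `(c, c_prime)` then satisfies `c == comp(d)(c_prime)` and `c_prime == comp(d_prime)(c)`, which is the same mutual relationship that Lemma 1 establishes for reducibility pairs.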

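For $\mathcal{F} \subseteq \mathcal{P}(\mathcal{T}) \times \mathcal{P}(\mathcal{T})$, we define
$p_1\mathcal{F} = \{C;\ \text{there exists } C',\ (C, C') \in \mathcal{F}\}$ and
$p_2\mathcal{F} = \{C';\ \text{there exists } C,\ (C, C') \in \mathcal{F}\}$.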

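Definition 1. The set $\mathcal{R}$ of reducibility pairs is the smallest subset
of $\mathcal{P}(\mathcal{T}) \times \mathcal{P}(\mathcal{T})$ such that:

1) $(\mu X.Neg_\emptyset(Neg_\emptyset(X)),\ Neg_\emptyset(\mu X.Neg_\emptyset(Neg_\emptyset(X)))) \in \mathcal{R}$;

2) If $(C, C') \in \mathcal{R}$ and $(D, D') \in \mathcal{R}$, then
$(\mu X.Neg_{C \times D}(Neg_{C' + D'}(X)),\ Neg_{C' + D'}(\mu X.Neg_{C \times D}(Neg_{C' + D'}(X)))) \in \mathcal{R}$;

3) If $\mathcal{F} \subseteq \mathcal{R}$, $S = p_1\mathcal{F}$ and $S' = p_2\mathcal{F}$, then
$(\mu X.Neg_{\bigcap S}(Neg_{\bigcup S'}(X)),\ Neg_{\bigcup S'}(\mu X.Neg_{\bigcap S}(Neg_{\bigcup S'}(X)))) \in \mathcal{R}$;

4) If $(C, C') \in \mathcal{R}$, then $(C', C) \in \mathcal{R}$.

The set $\mathcal{R}_0$ of reducibility candidates is $\mathcal{R}_0 = p_1\mathcal{R} = p_2\mathcal{R}$.

Comment. Because we have an involutive negation, $\neg\neg A$ and $A$ need to
have the same interpretation. This is achieved by constructing reducibility
candidates which are fixed points with respect to double negation. Reducibility
pairs correspond intuitively to interpretations of pairs of formulas $(A, \neg A)$.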

3.2 Properties of Reducibility Candidates 

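Lemma 1. If $(C, C') \in \mathcal{R}$, then one of the following cases holds:

1) $C = Neg_\emptyset(C')$ and $C' = Neg_\emptyset(C)$;

2) $C = Neg_{D_1 \times D_2}(C')$ and $C' = Neg_{D_1' + D_2'}(C)$ with $(D_i, D_i') \in \mathcal{R}$ for $i = 1, 2$;

3) $C = Neg_{D_1 + D_2}(C')$ and $C' = Neg_{D_1' \times D_2'}(C)$ with $(D_i, D_i') \in \mathcal{R}$ for $i = 1, 2$;

4) $C = Neg_{\bigcap S}(C')$ and $C' = Neg_{\bigcup S'}(C)$ with $S = p_1\mathcal{F}$, $S' = p_2\mathcal{F}$ and $\mathcal{F} \subseteq \mathcal{R}$;

5) $C = Neg_{\bigcup S}(C')$ and $C' = Neg_{\bigcap S'}(C)$ with $S = p_1\mathcal{F}$, $S' = p_2\mathcal{F}$ and $\mathcal{F} \subseteq \mathcal{R}$.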
Proof. We prove the result by induction on the construction of $(C, C')$.

1) $C = \mu X.Neg_\emptyset(Neg_\emptyset(X))$ and $C' = Neg_\emptyset(\mu X.Neg_\emptyset(Neg_\emptyset(X)))$.

We have $C' = Neg_\emptyset(C)$ and $C = \mu X.Neg_\emptyset(Neg_\emptyset(X))$; by definition of the fixed
point, we have $C = Neg_\emptyset(Neg_\emptyset(C))$ and therefore $C = Neg_\emptyset(C')$.

2) $C = \mu X.Neg_{D_1 \times D_2}(Neg_{D_1' + D_2'}(X))$
and $C' = Neg_{D_1' + D_2'}(\mu X.Neg_{D_1 \times D_2}(Neg_{D_1' + D_2'}(X)))$.

We have $C' = Neg_{D_1' + D_2'}(C)$ and $C = \mu X.Neg_{D_1 \times D_2}(Neg_{D_1' + D_2'}(X))$; by
definition of the fixed point, we have $C = Neg_{D_1 \times D_2}(Neg_{D_1' + D_2'}(C))$ and therefore
$C = Neg_{D_1 \times D_2}(C')$.

3) $C = \mu X.Neg_{\bigcap S}(Neg_{\bigcup S'}(X))$ and $C' = Neg_{\bigcup S'}(\mu X.Neg_{\bigcap S}(Neg_{\bigcup S'}(X)))$.

We have $C' = Neg_{\bigcup S'}(C)$ and $C = \mu X.Neg_{\bigcap S}(Neg_{\bigcup S'}(X))$; by definition of the
fixed point, we have $C = Neg_{\bigcap S}(Neg_{\bigcup S'}(C))$ and therefore $C = Neg_{\bigcap S}(C')$.

4) If $(C, C')$ is not obtained by clauses 1), 2) or 3) of definition 1, then $(C', C)$
is obtained by one of these clauses and we are in case 1), 3) or 5) of lemma 1.



Lemma 2. Let $(C, C') \in \mathcal{R}$ and $u \in \mathcal{T}$.

Then $\mu x.u \in C$ iff $\mu x.u \in \neg(C')$

iff for all $v \in C'$, $u[v/x] \in \mathcal{N}$.

Proof. Let $(C, C') \in \mathcal{R}$ and $u \in \mathcal{T}$. By lemma 1, we have
$C = Neg_D(C') = Var \cup D \cup \neg(C')$, with $D$ being $\emptyset$,
$E \times F$, $E + F$, $\bigcap S$ or $\bigcup S$, with $E, F \in \mathcal{R}_0$
and $S \subseteq \mathcal{R}_0$. Because $Var \cup D$ doesn’t contain terms
starting with a $\mu$, we have $\mu x.u \in C$ iff $\mu x.u \in \neg(C')$.



Lemma 3. If $C \in \mathcal{R}_0$, then $Var \subseteq C \subseteq \mathcal{N}$.

Proof. First remark that, for each $C \in \mathcal{R}_0$, we have by lemma 1,
$C = Neg_D(C') = Var \cup D \cup \neg(C')$ and therefore $Var \subseteq C$.

Let $C \in \mathcal{R}_0$ and $t \in C$. We prove $t \in \mathcal{N}$. By lemma 1,
we have $C = Neg_D(C') = Var \cup D \cup \neg(C')$, with $C' \in \mathcal{R}_0$
and $D$ being $\emptyset$, $E \times F$, $E + F$, $\bigcap S$ or $\bigcup S$,
with $E, F \in \mathcal{R}_0$ and $S \subseteq \mathcal{R}_0$. Therefore one of
the following cases holds:

1) $t \in Var$. In this case, $t \in \mathcal{N}$.

2) $t \in \neg(C')$. In this case, $t = \mu x.u$ and for all $v \in C'$,
$u[v/x] \in \mathcal{N}$; because $Var \subseteq C'$, we have $x \in C'$ and
$u \in \mathcal{N}$; therefore $\mu x.u \in \mathcal{N}$.

3) $t \in D$. One considers the possibilities for $D$ given by lemma 1.

a) $D = E \times F$, with $E, F \in \mathcal{R}_0$.
In this case $t = \langle u, v \rangle$ with $u \in E$ and $v \in F$; by
induction hypothesis, $u, v \in \mathcal{N}$ and therefore $t \in \mathcal{N}$.

b) $D = E + F$ with $E, F \in \mathcal{R}_0$.
In this case $t = \sigma_1(u)$ with $u \in E$ or $t = \sigma_2(v)$ with
$v \in F$; by induction hypothesis, $u, v \in \mathcal{N}$ and therefore
$t \in \mathcal{N}$.

c) $D = \bigcap S$ with $S \subseteq \mathcal{R}_0$.
In this case $t = a.u$ with $u \in C$ for all $C \in S$; by induction
hypothesis, $u \in \mathcal{N}$ and therefore $t \in \mathcal{N}$.

d) $D = \bigcup S$ with $S \subseteq \mathcal{R}_0$.
In this case $t = e.u$ with $u \in C$ and $C \in S$; by induction hypothesis,
$u \in \mathcal{N}$ and therefore $t \in \mathcal{N}$.

Lemma 4. Let $(C, C') \in \mathcal{R}$ and $u, u' \in \mathcal{T}$.

If $u \in C$ and $u \rhd_1 u'$, then $u' \in C$.

Proof. Let $(C, C') \in \mathcal{R}$ and $u, u' \in \mathcal{T}$ such that
$u \in C$ and $u \rhd_1 u'$. One proves $u' \in C$ by induction on the
construction of $(C, C')$.

By lemma 1, we have $C = Neg_D(C') = Var \cup D \cup \neg(C')$, with $D$ being
$\emptyset$, $C_1 \times C_2$, $C_1 + C_2$, $\bigcap S$ or $\bigcup S$, with
$C_1, C_2 \in \mathcal{R}_0$ and $S \subseteq \mathcal{R}_0$.

One considers the different possibilities for $u$.

If $u \in Var$, the result is trivial.

Suppose $u \in \neg(C')$. Then $u = \mu x.t$ with $t[v/x] \in \mathcal{N}$, for
all $v \in C'$, and $u' = \mu x.t'$ with $t \rhd_1 t'$. Let $v \in C'$; since
$t[v/x] \in \mathcal{N}$ and $t \rhd_1 t'$ we have $t'[v/x] \in \mathcal{N}$.
Therefore $\mu x.t' \in C$, i.e. $u' \in C$.

Suppose $u \in D$. One considers the possibilities for $D$ given by lemma 1.

1) $D = C_1 \times C_2$ with $C_1, C_2 \in \mathcal{R}_0$. In this case
$u = \langle u_1, u_2 \rangle$ with $u_1 \in C_1$ and $u_2 \in C_2$. There are
two possibilities for $u'$: either $u' = \langle u_1', u_2 \rangle$ with
$u_1 \rhd_1 u_1'$ or $u' = \langle u_1, u_2' \rangle$ with $u_2 \rhd_1 u_2'$.
By induction hypothesis we have $u_i' \in C_i$ and therefore $u' \in C$.

2) $D = C_1 + C_2$ with $C_1, C_2 \in \mathcal{R}_0$.
In this case there exists $i \in \{1, 2\}$ such that $u = \sigma_i(u_i)$ with
$u_i \in C_i$. Since $u \rhd_1 u'$, we have $u' = \sigma_i(u_i')$ with
$u_i \rhd_1 u_i'$. By induction hypothesis we have $u_i' \in C_i$ and therefore
$u' \in C$.

3) $D = \bigcap S$ with $S \subseteq \mathcal{R}_0$.
In this case $u = a.t$ with $t \in E$ for each $E \in S$. Since $u \rhd_1 u'$,
we have $u' = a.t'$ with $t \rhd_1 t'$. By induction hypothesis we have
$t' \in E$ for each $E \in S$ and therefore $u' \in C$.

4) $D = \bigcup S$ with $S \subseteq \mathcal{R}_0$.
In this case there exist $E \in S$ and $t \in E$ such that $u = e.t$. Since
$u \rhd_1 u'$, we have $u' = e.t'$ with $t \rhd_1 t'$. By induction hypothesis
we have $t' \in E$ and therefore $u' \in C$.

Lemma 5. Let $(C, C') \in \mathcal{R}$ and $u, u' \in \mathcal{T}$.

If $u \in C$ and $u' \in C'$, then $u * u' \in \mathcal{N}$.

Proof. Let $(C, C') \in \mathcal{R}$, $u \in C$ and $u' \in C'$. By lemma 3, we
have $u \in \mathcal{N}$ and $u' \in \mathcal{N}$. Let $N(u)$ (resp. $N(u')$) be
the sum of the lengths of the reduction sequences of $u$ (resp. $u'$). We prove
$u * u' \in \mathcal{N}$ by a double induction on the construction of $(C, C')$
and $N(u) + N(u')$.

In order to prove $u * u' \in \mathcal{N}$ we prove: for all $w \in \mathcal{T}$,
if $u * u' \rhd_1 w$ then $w \in \mathcal{N}$.
We consider the different possibilities for $w$.

1) $w = t[u'/x]$ with $u = \mu x.t$.
By lemma 2, we have $\mu x.t \in \neg(C')$ and therefore $t[u'/x] \in \mathcal{N}$.

2) $w = t'[u/x]$ with $u' = \mu x.t'$.
The proof is analogous to that of case 1).

3) $w = u_i * u_i'$ with $u = \langle u_1, u_2 \rangle$, $u' = \sigma_i(u_i')$ and $i \in \{1, 2\}$.
By lemma 1, we have $C = Neg_{C_1 \times C_2}(C')$, $C' = Neg_{C_1' + C_2'}(C)$
with $(C_j, C_j') \in \mathcal{R}$, $u_i \in C_i$ and $u_i' \in C_i'$. Since
$u_i \in C_i$ and $u_i' \in C_i'$, we have by induction hypothesis
$u_i * u_i' \in \mathcal{N}$.

4) $w = u_i * u_i'$ with $u = \sigma_i(u_i)$, $u' = \langle u_1', u_2' \rangle$ and $i \in \{1, 2\}$.
The proof is analogous to that of case 3).

5) $w = t * t'$ with $u = a.t$ and $u' = e.t'$.
By lemma 1, we have $C = Neg_{\bigcap S}(C')$, $C' = Neg_{\bigcup S'}(C)$ with
$S = \{D;\ \text{there exists } D',\ (D, D') \in \mathcal{F}\}$,
$S' = \{D';\ \text{there exists } D,\ (D, D') \in \mathcal{F}\}$,
$\mathcal{F} \subseteq \mathcal{R}$, $t \in D$ for all $D \in S$ and
$t' \in D_0'$ for a certain $D_0' \in S'$. Let $D_0$ be such that
$(D_0, D_0') \in \mathcal{F}$; we have $t \in D_0$ and $t' \in D_0'$; by
induction hypothesis, it follows that $t * t' \in \mathcal{N}$.

6) $w = t * t'$ with $u = e.t$ and $u' = a.t'$.
The proof is analogous to that of case 5).

7) $w = u_1 * u'$ with $u \rhd_1 u_1$.
By lemma 4, we have $u_1 \in C$. Because $N(u_1) < N(u)$, we have by induction
hypothesis $u_1 * u' \in \mathcal{N}$.

8) $w = u * u_1'$ with $u' \rhd_1 u_1'$.
The proof is analogous to that of case 7).



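3.3 Interpretation of Formulas

Let $\Delta$ be the set of type variables and negated type variables.

Definition 2. A valuation $\alpha$ is a function from $\Delta$ to $\mathcal{R}_0$
such that for each type variable $X$, $(\alpha(X), \alpha(\neg X)) \in \mathcal{R}$.
For $U \in \Delta$ and $C \in \mathcal{R}_0$, we denote by $\alpha[C/U]$
the valuation $\alpha'$ defined by $\alpha'(U) = C$ and $\alpha'(V) = \alpha(V)$ for $V \neq U$.

The value $\|A\|^\alpha$ of an m-type $A$ for a valuation $\alpha$ is defined inductively as follows:

$\|X\|^\alpha = \alpha(X)$, for $X$ a type variable;

$\|\neg X\|^\alpha = \alpha(\neg X)$, for $X$ a type variable;

$\|A \wedge B\|^\alpha = \mu X.Neg_{\|A\|^\alpha \times \|B\|^\alpha}(Neg_{\|\neg A\|^\alpha + \|\neg B\|^\alpha}(X))$

$\|A \vee B\|^\alpha = Neg_{\|A\|^\alpha + \|B\|^\alpha}(\|\neg A \wedge \neg B\|^\alpha)$

$\|\forall X A\|^\alpha = \mu X.Neg_{\bigcap\{\|A\|^{\alpha[C/X, C'/\neg X]};\ (C, C') \in \mathcal{R}\}}(Neg_{\bigcup\{\|\neg A\|^{\alpha[C/X, C'/\neg X]};\ (C, C') \in \mathcal{R}\}}(X))$

$\|\exists X A\|^\alpha = Neg_{\bigcup\{\|A\|^{\alpha[C/X, C'/\neg X]};\ (C, C') \in \mathcal{R}\}}(\|\forall X \neg A\|^\alpha)$

The definition is extended to types by $\|\bot\|^\alpha = \mathcal{N}$.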



Lemma 6. For each valuation $\alpha$ and each m-type $A$, $(\|A\|^\alpha, \|\neg A\|^\alpha) \in \mathcal{R}$.

Proof. Easy induction on $A$. The case where $A$ is a type variable is given by the
definition of valuation.

Lemma 7. Let $A$, $B$ be m-types and $\alpha$, $\alpha'$ valuations.

(1) Suppose that $\alpha(X) = \alpha'(X)$ and $\alpha(\neg X) = \alpha'(\neg X)$, for each type variable $X$
free in $A$. Then $\|A\|^\alpha = \|A\|^{\alpha'}$.

(2) $\|A[B/X]\|^\alpha = \|A\|^{\alpha[\|B\|^\alpha/X,\ \|\neg B\|^\alpha/\neg X]}$

Proof. Easy but tedious inspection (this lemma says only that our notion of
value is correctly defined).



3.4 Proof of Strong Normalization 

Lemma 8. Let $A_1, \ldots, A_n$ be m-types and $C$ a type.

If $x_1 : A_1, \ldots, x_n : A_n \vdash t : C$, then for all $u_1 \in \|A_1\|^\alpha, \ldots, u_n \in \|A_n\|^\alpha$,

$t[u_1/x_1, \ldots, u_n/x_n] \in \|C\|^\alpha$.

Proof. By induction on the derivation of $x_1 : A_1, \ldots, x_n : A_n \vdash t : C$. One
considers the different possibilities for $t$.

1) $t = x_i$ and $C = A_i$. In this case, we have $t[u_1/x_1, \ldots, u_n/x_n] = u_i \in \|C\|^\alpha$.

2) $t = \langle t_1, t_2 \rangle$ and $C = C_1 \wedge C_2$. By induction hypothesis we have
$t_1[u_1/x_1, \ldots, u_n/x_n] \in \|C_1\|^\alpha$ and $t_2[u_1/x_1, \ldots, u_n/x_n] \in \|C_2\|^\alpha$. Therefore
$t[u_1/x_1, \ldots, u_n/x_n] \in \|C_1\|^\alpha \times \|C_2\|^\alpha$ and $t[u_1/x_1, \ldots, u_n/x_n] \in \|C_1 \wedge C_2\|^\alpha$.

3) $t = \sigma_i(t_i)$ with $i \in \{1, 2\}$ and $C = C_1 \vee C_2$. By induction hypothesis we have
$t_i[u_1/x_1, \ldots, u_n/x_n] \in \|C_i\|^\alpha$. Therefore $t[u_1/x_1, \ldots, u_n/x_n] \in \|C_1\|^\alpha + \|C_2\|^\alpha$ and
$t[u_1/x_1, \ldots, u_n/x_n] \in \|C_1 \vee C_2\|^\alpha$.

4) $t = t_1 * t_2$ and $C = \bot$. In this case $\bot$ is obtained from $C_1$ and $\neg C_1$. By
induction hypothesis we have $t_1[u_1/x_1, \ldots, u_n/x_n] \in \|C_1\|^\alpha$ and
$t_2[u_1/x_1, \ldots, u_n/x_n] \in \|\neg C_1\|^\alpha$. Therefore by lemmas 6 and 5,
$t[u_1/x_1, \ldots, u_n/x_n] \in \mathcal{N}$, i.e. $t[u_1/x_1, \ldots, u_n/x_n] \in \|\bot\|^\alpha$.

5) $t = \mu x.s$ and $C = \neg A$. In this case $t[u_1/x_1, \ldots, u_n/x_n] = \mu x.s[u_1/x_1, \ldots, u_n/x_n]$.
By induction hypothesis, we have $s[u_1/x_1, \ldots, u_n/x_n, v/x] \in \|\bot\|^\alpha = \mathcal{N}$,
for all $v \in \|A\|^\alpha$. Therefore by lemma 2, $\mu x.s[u_1/x_1, \ldots, u_n/x_n] \in \|\neg A\|^\alpha$, i.e.
$t[u_1/x_1, \ldots, u_n/x_n] \in \|C\|^\alpha$.

6) $t = a.s$ and $C = \forall X A$. In this case $\forall X A$ is deduced from $A[Y/X]$ with $Y$ not
free in $A_1, \ldots, A_n, \forall X A$. We have to show $a.s[u_1/x_1, \ldots, u_n/x_n] \in \|\forall X A\|^\alpha$. By
definition of $\|\forall X A\|^\alpha$, it suffices to show $s[u_1/x_1, \ldots, u_n/x_n] \in \|A\|^{\alpha[C/X, C'/\neg X]}$
for all $(C, C') \in \mathcal{R}$. Let $(C, C') \in \mathcal{R}$. Because $Y$ is not free in $A_1, \ldots, A_n$, we have
by lemma 7, $\|A_i\|^{\alpha[C/Y, C'/\neg Y]} = \|A_i\|^\alpha$ and therefore $u_i \in \|A_i\|^{\alpha[C/Y, C'/\neg Y]}$ for
each $i \in \{1, \ldots, n\}$. By induction hypothesis,
$s[u_1/x_1, \ldots, u_n/x_n] \in \|A[Y/X]\|^{\alpha[C/Y, C'/\neg Y]}$. Because $Y$ is not free in $\forall X A$,
$\|A[Y/X]\|^{\alpha[C/Y, C'/\neg Y]} = \|A\|^{\alpha[C/X, C'/\neg X]}$ and therefore
$s[u_1/x_1, \ldots, u_n/x_n] \in \|A\|^{\alpha[C/X, C'/\neg X]}$.

7) $t = e.s$ and $C = \exists X A$. In this case $\exists X A$ is deduced from $A[B/X]$, for a
certain type $B$. We have to show $e.s[u_1/x_1, \ldots, u_n/x_n] \in \|\exists X A\|^\alpha$. By
definition of $\|\exists X A\|^\alpha$, it suffices to show that there exists $(C, C') \in \mathcal{R}$ such that
$s[u_1/x_1, \ldots, u_n/x_n] \in \|A\|^{\alpha[C/X, C'/\neg X]}$. Let $(C, C') = (\|B\|^\alpha, \|\neg B\|^\alpha)$. By
induction hypothesis we have $s[u_1/x_1, \ldots, u_n/x_n] \in \|A[B/X]\|^\alpha$. By lemma 7, we
have $\|A[B/X]\|^\alpha = \|A\|^{\alpha[C/X, C'/\neg X]}$ and therefore
$s[u_1/x_1, \ldots, u_n/x_n] \in \|A\|^{\alpha[C/X, C'/\neg X]}$.

Theorem 1. If $x_1 : A_1, \ldots, x_n : A_n \vdash t : C$, then $t$ is strongly normalizable.

Proof. Suppose $x_1 : A_1, \ldots, x_n : A_n \vdash t : C$. For each $i \in \{1, \ldots, n\}$ we have
$\|A_i\|^\alpha \in \mathcal{R}_0$ by lemma 6 and $x_i \in \|A_i\|^\alpha$ by lemma 3. Therefore by lemma
8, $t \in \|C\|^\alpha$. If $C = \bot$, then $\|C\|^\alpha = \mathcal{N}$ and $t \in \mathcal{N}$; otherwise by lemma 6,
$\|C\|^\alpha \in \mathcal{R}_0$ and therefore by lemma 3, $t \in \mathcal{N}$.
Abstract. We propose a polynomial time approximation scheme for
scheduling a set of dedicated tasks on a constant number $m$ of processors
in order to minimize the sum of completion times, $Pm|fix_j|\sum C_j$. In addition
we give a polynomial time approximation scheme for the weighted
preemptive problem with release dates, $Pm|fix_j, pmtn, r_j|\sum w_j C_j$.



1 Introduction 

In the last few years, a significant amount of work has been devoted to the study of
scheduling problems in which the objective is to minimize the sum of completion
times. In [1], the authors presented the first polynomial time approximation
schemes (PTASs) for scheduling to minimize the average weighted completion
time (in the presence of release dates) in various machine models including one
machine, identical parallel machines and unrelated parallel machines, with and
without preemption. In all these models each task is processed on at most one
machine at a time. In contrast, no PTAS was known for scheduling problems in
which the objective is to minimize the average completion time and which involve
multiprocessor tasks, i.e. tasks that may require more than one processor at a time.

In this paper, we propose the first PTASs for the dedicated multiprocessor task 
model in which the objectives are the minimization of the average completion 
time in the non-preemptive case, and the average weighted completion time in the 
preemptive case in the presence of release dates. For the dedicated multiprocessor
model, the only known PTAS was for the case where the objective is to minimize 
the makespan [2]. 

Using the standard three field notation [5], the problem of scheduling dedicated
tasks on a set of processors in order to minimize the sum of task completion
times is denoted as $P|fix_j|\sum C_j$. This problem was studied for the first time by
Hoogeveen et al. in [6], where it was shown to be NP-hard in the strong sense,
even in the case where all the tasks have unit execution times. Cai, Lee and Li [4]
proved that the problem is also strongly NP-hard, even in the case where there
are just 2 processors. On the other hand, in [3], Brucker and Kramer proved that
the problem is polynomial in the case where the tasks have unit execution times
and the number of processors is a fixed constant, i.e. for $Pm|fix_j, p_j = 1|\sum C_j$,
as well as when in addition the tasks have release dates. In terms of approximation
algorithms, in [4], a 2-approximation is given for the 2 processor problem
$P2|fix_j|\sum C_j$. In this paper an approximation scheme is given for the $m$ processor
problem $Pm|fix_j|\sum C_j$ with $m$ constant, via a reduction to the preemptive
version of the problem.

When the tasks can be preempted, the problem is a bit easier. In fact the
2 processor case $P2|fix_j, pmtn|\sum C_j$ is polynomial [4]. We present an approximation
scheme for the generalization where the tasks have weights and
release dates, and the goal is to minimize the weighted sum of completion times,
$Pm|fix_j, r_j, pmtn|\sum w_j C_j$. Note that Labetoulle et al. [7] proved that the single
processor version of this problem, $P1|r_j, pmtn|\sum w_j C_j$, is already strongly
NP-hard.

In section 2, we present a reduction from the non-preemptive problem
$Pm|fix_j|\sum C_j$ to the preemptive problem $Pm|fix_j, pmtn|\sum C_j$. In section 3,
we present an approximation scheme for the (more general) preemptive problem
$Pm|fix_j, pmtn, r_j|\sum w_j C_j$.



2 A Reduction from Non-preemptive to Preemptive 

Formulation of the problem. We are given a set of $n$ tasks $T = \{1, 2, \ldots, n\}$
and a set of $m$ processors $M$. The tasks are dedicated: each task $j$ requires for
its execution the simultaneous availability of a prespecified subset of processors
$\tau_j \subseteq M$ for $p_j$ units of time. The set $\tau_j$ is called the type of task $j$. We denote by
$S_j$ and $C_j$ the starting and completion times of task $j$. The problem is to design
a schedule which minimizes the sum of task completion times, $\sum_{j=1}^{n} C_j$.

2.1 The Algorithm 

Let $UB$ denote the upper bound to the optimal cost obtained by processing all
the tasks sequentially in order of non-decreasing processing times. Note that this 
upper bound is within a constant factor m of optimal. The lemma below shows 
how to separate tasks into “long” tasks and “short” tasks, with a few (negligible) 
“medium” tasks in between. Its proof is a simple algebraic manipulation. 
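
The upper bound $UB$ follows directly from its definition: run the tasks one after another, shortest first, and add up the completion times. The Python sketch below is only an illustration under stated assumptions; the `Task` record and the function name `sequential_upper_bound` are hypothetical and do not come from the paper, and ties between equal processing times are broken arbitrarily.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical record for a dedicated task (names are assumptions)."""
    name: str
    procs: frozenset  # the type of the task: the processors it needs
    p: float          # processing time p_j


def sequential_upper_bound(tasks):
    """Cost of running all tasks sequentially in non-decreasing order of p_j.

    Because no two tasks overlap in time, the schedule is feasible whatever
    the task types are, so its cost is an upper bound on the optimum.
    """
    clock = 0.0
    total = 0.0
    for task in sorted(tasks, key=lambda t: t.p):
        clock += task.p   # completion time C_j of this task
        total += clock    # accumulate the sum of completion times
    return total
```

For three tasks with processing times 3, 1 and 2 the completion times are 1, 3 and 6, so the bound is 10.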
Lemma 1. Let Li denote the set of tasks j with processing time pj > 

Mi denote the set of tasks j with processing time < pj < 

and $S_i = T \setminus (L_i \cup M_i)$ denote the remaining tasks. If $L_i \neq \emptyset$, then there exists

an i < log]^_|_j(l/e^°) such that OPT{Li U Mi) < (1 + e^)OPT{Li). 

At a high level, the algorithm is as follows. 

1. For every i < logi_|_g(l/e^°), partition the tasks into T = Li U Mi LI Si and 

construct a schedule of T as follows. 

2. Construct a non-preemptive schedule of Si as follows. 

(a) Solve the preemptive problem for Si with relative error e. 

(b) For each subset r of the machines, consider all the time intervals during 
which the schedule executes tasks of type r, and reorder the tasks of 
type r in these time intervals by order of increasing processing time. 

(c) Stretch time by a factor of $(1+\epsilon)$, so that each task in the schedule
corresponds to an interval or a set of intervals (if preemption occurred)
of total measure $(1+\epsilon)p_j$; leave the first $\epsilon p_j$ section idle and process task
$j$ during the last $p_j$ part of the interval or set of intervals.

(d) Modify the schedule to reduce the number of preemptions in the following
way. In each interval $I = [(1+\epsilon)^k, (1+\epsilon)^{k+1}]$, look at the schedule
during that interval, erasing the names of the tasks and only remembering
the task types; at every instant a certain set of task types is
being processed; call this a configuration. We reorder the schedule inside
$I$ so that identical configurations are contiguous; put the tasks back in,
in order of increasing processing time. Note that in this modified schedule,
a task can only be preempted when a configuration ends, thus the
number of preemptions in the interval is at most $m$ times the number
$A$ of configurations, which is $O(1)$ since the number $m$ of processors is
bounded and configurations, which are just partitions of $\{1, 2, \ldots, m\}$
into task types, are also in constant number.

(e) Modify the schedule to make it non-preemptive in the following way: for 

each preempted task $j$, if its completion time $t$ is greater than $A p_j/\epsilon^2$,
then finish executing j there, otherwise remove task j and insert it in 
the gap at time (1 -I- e)* such that (1 -I- e)*“^ < < (1 + e)*- 

(f) Remove all times during which all processors are idle from the resulting 
schedule. 

3. Construct an optimal non-preemptive schedule of Li U Mi by exhaustive 

search. 

4. Concatenate the schedule of Si and the schedule of Li U Mi. 

5. Output the best resulting schedule, over all choices of i. 
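
Step 2(c) of the list above can be made concrete for a single task. The sketch below is a minimal illustration, not the authors' implementation: it assumes a preemptive schedule gives each task as a list of `(start, end)` pieces, and the function name `stretch_task_pieces` is hypothetical. Every time point is multiplied by $(1+\epsilon)$, the first $\epsilon p_j$ units of the stretched pieces are left idle, and the task runs in the remaining $p_j$ units.

```python
def stretch_task_pieces(pieces, eps):
    """Apply the time stretch of step 2(c) to the pieces of one task.

    pieces: list of (start, end) intervals during which the task runs in
            the preemptive schedule; their total length is p_j.
    eps:    the accuracy parameter epsilon > 0.
    Returns the pieces in which the task runs after stretching.
    """
    p_j = sum(end - start for start, end in pieces)
    idle = eps * p_j  # measure that must be left idle at the front
    result = []
    for start, end in sorted(pieces):
        # Stretch both endpoints by the factor (1 + eps).
        start, end = start * (1 + eps), end * (1 + eps)
        length = end - start
        if idle >= length:
            # This whole stretched piece is consumed by the idle section.
            idle -= length
            continue
        result.append((start + idle, end))
        idle = 0.0
    return result
```

With `eps = 0.5` and pieces `[(0, 2), (4, 6)]` the stretched pieces are `(0, 3)` and `(6, 9)`; the first 2 units stay idle, and the task runs during `(2, 3)` and `(6, 9)`, which again has total length 4.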

2.2 Analysis of Running Time 

There are only $O(1)$ possibilities for $i$. For each choice of $i$, we partition the
tasks in $O(n)$ time, run the preemptive approximation scheme once, reorder the tasks
of the same type in $O(n \log n)$ time, perform the rest of step 2 in $O(n)$ time, and
construct an optimal schedule of $L_i \cup M_i$ in time $O(|L_i \cup M_i|!)$. Since $L_i \cup M_i$
consists of tasks with processing time greater than there can be 

at most such tasks. By Lemma 1, z < logi_|_j(l/e^°) and so \Li U 

so that $|L_i \cup M_i|! = O(1)$. Thus the overall running time is
$O(n \log n) + O(\text{preemptive approximation scheme})$.



2.3 Analysis of the Sum of Completion Times 

At the end of step 2a, the schedule of $S_i$ has cost at most $(1+\epsilon)$ times the
optimal preemptive schedule cost, which is at least as good as the optimal
non-preemptive schedule, hence the schedule of $S_i$ has cost at most $(1+\epsilon)OPT(S_i)$.
Step 2b can only decrease the cost. Step 2c increases the cost by a factor of $(1+\epsilon)$.
Since step 2d only modifies the completion times inside the intervals, it also increases
the cost by a factor of $(1+\epsilon)$ at most.

Step 2e is more difficult to analyze. First, finishing all the short preempted
tasks of interval $I_k = [(1+\epsilon)^k, (1+\epsilon)^{k+1}]$ creates a delay of at most
$p_j < \epsilon^2(1+\epsilon)^{k+1}/A$ for each of the $A$ configurations, adding up to a delay of at most
$\epsilon^2(1+\epsilon)^{k+1}$ due to interval $I_k$. Thus a task $j$ completed in interval $I_i$ is delayed
by intervals $I_1, I_2, \ldots, I_{i-1}$, for a total delay of at most

$\epsilon^2 + \epsilon^2(1+\epsilon) + \ldots + \epsilon^2(1+\epsilon)^{i-1} < \epsilon(1+\epsilon)^i \le \epsilon C_j.$

Thus these delays increase the cost by a factor of $(1+\epsilon)$ at most.

Secondly, the long tasks displaced in step 2e may also cause further delays. 
A gap at time t' = (1 + e)^ receives only tasks previously scheduled preemptively 
to complete at time t < /m. These tasks use up a space of at most in 

the gap at t' , which again sums to a negligible delay in the schedule, a factor of 
at most (1 + e). 

Thirdly, the long tasks displaced see their own completion times greatly in- 
creased. Call V the set of such displaced tasks. They were displaced because their 
completion time in the preemptive schedule was smaller than Apj/e^, and the 
displacement increased their completion time by a factor of m/e^, thus the sum 
of their new completion times is at most 'mApjle^. But their processing 

times sum to at most m times the makespan of the schedule S. 

Let $p_{\max}(S_i)$ be the maximum processing time of $S_i$ and $\mathcal{M}$ the makespan of
$S$. Considering that at least $\mathcal{M}/(2p_{\max})$ tasks will be executed during the last
$\mathcal{M}/2$ steps of $S$ and hence have completion time greater than $\mathcal{M}/2$, we obtain
that the cost of $S$ is at least $\mathcal{M}^2/(4p_{\max})$, hence

M < 2^p^,xCOST{S) < 2 Vei°OPr(T)(l + efOPT{Si)/m. 

Thus the new completion times of the tasks of T> sum to at most 2Aey/mOPT{T). 

Finally, at the end of step 2, the non-preemptive schedule X of Si has cost 
at most (1 -I- e)^OPT{Si) + 2A-JmeOPT{T) and makespan at most A4{X) < 
2i/p^,x{S,)C0ST{X). 

Step 3 constructs an optimal schedule $\mathcal{Y}$ of $L_i \cup M_i$ of cost $OPT(L_i \cup M_i)$
which is at most (1 -I- e)^OPT{Li) by Lemma 1. 
Step 4 concatenates the two schedules, for a cost of $COST(\mathcal{X}) + COST(\mathcal{Y}) +
|L_i \cup M_i| \mathcal{M}(\mathcal{X})$, which we now need to analyze. $L_i$ satisfies
OPT{Li) > |LjpPmin(i*)/(2TO^), thus 

{L,M{X)f < 2m" Ap^USi)COST{X) < ^m^{l + efe^OPT{Tf 

since the processing times in Li and Si differ by a factor of e® at least. 

Moreover, Mi satisfies OPT{Mi) > |Mippnii„(Mi)/(2m"), and moreover 
OPT{Mi) < OPT{L, U M,) - OPT{Li) < e^OPT{T), thus 

(|M,|A4(A’))" < 2m"e" Ap^^^{S{)COST{X) < Sm^e\l + efOPT{rf. 

Thus the concatenated schedule has overall cost 

(1 + efOPT{Si) + 2^AeOPT{T) + (1 + e")OPT(L,)+ 

2v^m(l + e)"-®e"-®OPr(T) + 2v^me(l + efOPT{T). 

Since $OPT(S_i) + OPT(L_i) \le OPT(T)$, we obtain that the cost of the schedule
is $(1 + O(\epsilon))OPT(T)$.

3 Solving the Preemptive Problem 

In this section we present a PTAS for $Pm|fix_j, r_j, pmtn|\sum w_j C_j$ with release
dates $r_j$ and weights $w_j$ for each task. First, using ideas in [1] and new ideas we
simplify the problem instance. Then, we apply a dynamic programming technique
to find an approximate schedule. Inside the dynamic program we use
an optimal algorithm of Amoura et al. [2] for $Pm|fix_j, pmtn|C_{\max}$ (makespan
optimization) to test whether tasks can be processed in an interval or not. In
total, we prove the following result:

Theorem 1. There is a PTAS for $Pm|fix_j, r_j, pmtn|\sum w_j C_j$ that constructs
a $(1+\epsilon)$-approximation in $O(n \log n)$ time (with $m$ and $\epsilon > 0$ constant).

As in section 2 we partition the time $(0, \infty)$ into disjoint intervals of the form
$I_x := [R_x, R_{x+1})$ with $R_{x+1} = R_x(1+\epsilon)$ and $|I_x| = \epsilon R_x$ (we use $I_x$ to refer to both
$|I_x|$ and $I_x$). Let $T^\tau \subseteq T$ denote the set of tasks with the same type $\tau \subseteq M$; $T_x^\tau$
is the set of tasks in $T^\tau$ that are released at $I_x$. Let $C_j$ be the completion and
$S_j$ be the starting time of task $j$. The values $x(j)$ and $z(j)$ denote the indices of
the intervals $I_{x(j)}$ and $I_{z(j)}$ where task $j$ is released and completed, respectively.
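
The geometric partition of the time line is easy to compute. The sketch below is only an illustration: it assumes $R_x = R_0 (1+\epsilon)^x$ with $R_0 = 1$ by default (the text does not fix $R_0$ here), and the function names `interval` and `interval_index` are hypothetical.

```python
import math


def interval(x, eps, r0=1.0):
    """Endpoints (R_x, R_{x+1}) of I_x, assuming R_x = r0 * (1 + eps) ** x."""
    left = r0 * (1 + eps) ** x
    return left, left * (1 + eps)


def interval_index(t, eps, r0=1.0):
    """Index x of the interval I_x = [R_x, R_{x+1}) that contains time t."""
    if t < r0:
        raise ValueError("t lies before the first interval")
    # Initial guess from logarithms, then repair floating-point rounding.
    x = int(math.log(t / r0) / math.log(1 + eps))
    while interval(x, eps, r0)[0] > t:
        x -= 1
    while interval(x, eps, r0)[1] <= t:
        x += 1
    return x
```

Each interval has length $R_{x+1} - R_x = \epsilon R_x$, in agreement with $|I_x| = \epsilon R_x$ above.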

First, we simplify the problem instance. With at most $1+\epsilon$ loss in the objective
function, we can assume that all release dates $r_j$ and processing times
$p_j$ are integer powers of $1+\epsilon$; $r_j \ge \epsilon p_j$ and $p_j \ge 1$ [1]. As a consequence, the
processing time of each task $j$ is at most $1/\epsilon^2$ times the length of the
interval where this task is released (since $r_j = R_{x(j)} \ge \epsilon p_j$ and $I_{x(j)} = \epsilon R_{x(j)}$,
we have $p_j \le I_{x(j)}/\epsilon^2$). This means that every task
crosses at most a constant number of intervals.
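
One simple way to obtain values that are integer powers of $1+\epsilon$ is to round each value up to the next such power, which changes it by a factor smaller than $1+\epsilon$. The sketch below illustrates only this rounding step; it is an assumption that rounding up is the variant used (the details are in [1]), and the function name `round_up_to_power` is hypothetical.

```python
import math


def round_up_to_power(v, eps):
    """Smallest integer power of (1 + eps) that is >= v, for v >= 1."""
    if v < 1:
        raise ValueError("expected v >= 1")
    k = max(0, math.ceil(math.log(v) / math.log(1 + eps)))
    # Repair possible floating-point rounding of the logarithm.
    while (1 + eps) ** k < v:
        k += 1
    while k > 0 and (1 + eps) ** (k - 1) >= v:
        k -= 1
    return (1 + eps) ** k
```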
Furthermore, we can assume that all quotients $p_j/w_j$ are different. Let
$p_{j_1}/w_{j_1} \le p_{j_2}/w_{j_2} \le \ldots \le p_{j_n}/w_{j_n}$. Suppose that some tasks have the same
values $p_j/w_j$. In this case, we can increase the weights to $w'_j$ with $w_j \le w'_j \le (1+\epsilon)w_j$
such that all quotients $p_j/w'_j$ are different. The objective value of a schedule with
the new weights, $\sum w'_j C_j$, is bounded by $(1+\epsilon) \sum w_j C_j$. Finally, we can rearrange
tasks inside an interval and consider $\sum w_j R_{z(j)}$ instead of our original objective
function.

Now we introduce two types of tasks. Using the assumption above, every
release date $r_j$ is the left endpoint of an interval $I_{x(j)}$. A task $j$ is large, if the

processing time pj is larger than and is small otherwise. Let be the 

set of large tasks and ST^. the set of small tasks in TJ . We may assume that 
e < 2 ^ and log^_|_^ i, ^ are integral. 

For an optimal schedule and tasks in $T_x^\tau$ we can assume that each task is
processed completely before another one starts to be processed. In other words,
for $t, \ell \in T_x^\tau$ we have $C_t \le S_\ell$ or $C_\ell \le S_t$.

Lemma 2. With at most $1 + O(\epsilon)$ loss, we can assume that each task in $ST_x^\tau$
is processed completely in one interval.



Proof. Fix an interval $I_y$ where this property does not hold. Consider one
processor type $\tau \subseteq M$. Using the observation above it follows that there is at
most one small task $j_x$ released in interval $I_x$ with $x < y$, such that $j_x$ is started
in $I_y$ but finished later.

The goal is to complete all these small tasks (among all previous intervals 
Ix) in ly. The total processing time of these tasks can be bounded as follows: 



x<y 2i<y 



- A.2-(l + e)* 



< y ^ ^ ^ 

“ 2"* ^ (l + e)* “ 2™ “2"* 



Adding the bound for all types $\tau \subseteq M$, the total time to complete all these small
tasks is at most $2\epsilon I_y$. To create $2\epsilon I_y$ idle time, we shift the entire schedule two
intervals forward. This increases the objective function by a factor of at most $(1+\epsilon)^2$.
Using these idle times and preemptions we are able to reschedule and to complete
the small tasks within $I_y$.



The tasks in $ST_x^\tau$ are scheduled by Smith’s rule if they are scheduled in order
of increasing $p_j/w_j$. We say that two tasks $t \in ST_{x(t)}^\tau$ and $t' \in ST_{x(t')}^\tau$ (with
release indices $x(t') < x(t)$) are scheduled by Smith’s rule if one of the two following
conditions holds:

(1) $t'$ is completed before $t$ is released,

(2) $t'$ is not started before $t$ is released and

(2.1) if $p_t/w_t < p_{t'}/w_{t'}$ then $t'$ starts only after $t$ is completed,

(2.2) if $p_t/w_t > p_{t'}/w_{t'}$ then $t$ starts only after $t'$ is completed.
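
In its simplest form, without release dates and on a single processor set, Smith’s rule just sorts the tasks by the ratio $p_j/w_j$. The sketch below illustrates that special case only; tasks are assumed to be `(p, w)` pairs, and the function names are hypothetical. The brute-force helper is there purely to check the rule on small inputs.

```python
from itertools import permutations


def smith_order(tasks):
    """Order (p, w) pairs by non-decreasing ratio p / w (Smith's rule)."""
    return sorted(tasks, key=lambda task: task[0] / task[1])


def weighted_completion(sequence):
    """Sum of w_j * C_j when the tasks run back to back in this order."""
    clock = 0
    total = 0
    for p, w in sequence:
        clock += p
        total += w * clock
    return total


def best_by_enumeration(tasks):
    """Minimum weighted completion time over all orders (exponential time)."""
    return min(weighted_completion(order) for order in permutations(tasks))
```

For the tasks `(3, 1)`, `(1, 2)` and `(2, 2)` the ratios are 3, 0.5 and 1, so Smith’s rule runs them in the order `(1, 2)`, `(2, 2)`, `(3, 1)` with completion times 1, 3 and 6 and cost $2 + 6 + 6 = 14$.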

The following lemma gives us a powerful tool to handle small tasks. 
Lemma 3. With at most $1 + O(\epsilon)$ loss, for each processor set $\tau$ we can assume
that all small tasks of type $\tau$ are scheduled by Smith’s rule.



Proof. Consider an optimal schedule where no small task crosses an interval and
where the quotients $p_j/w_j$ are different. Let $S^\tau$ be the set of all small tasks with
processor set $\tau$, and let $S_x^\tau \subseteq S^\tau$ be the subset of small tasks that are executed in
$I_x$. Then, define $p(I_x, \tau) = \sum_{j \in S_x^\tau} p_j$ as the total time to process small tasks in
$I_x$. Furthermore, $x(j)$ denotes the index of the interval where task $j$ is released
and $L$ denotes the index of the last interval. For any type $\tau \subseteq M$, we study the
following linear program:

Minimize $\sum_{j \in S^\tau} w_j \sum_{i=x(j)}^{L} y_{j,i} R_i$ s.t.

(1) $\sum_{i=x(j)}^{L} y_{j,i} = 1$, $\forall j \in S^\tau$,

(2) $\sum_{j \in S^\tau,\, x(j) \le i} y_{j,i}\, p_j \le p(I_i, \tau)$, $\forall I_i$ and $\tau \subseteq M$,

(3) $y_{j,i} \ge 0$, $\forall j \in S^\tau$, $i = x(j), \ldots, L$.



First, the optimal value of the linear program is not larger than the weighted
completion time of the small tasks, because the schedule itself yields a feasible
solution and allowing fractional assignments can only give a smaller value. In other
words, the value of an optimal fractional solution is a lower bound of the weighted
completion time. Consider an optimal solution $(y^*_{j,i})$ of the linear program. Suppose
that two tasks $t$ and $t'$ are scheduled not by Smith’s rule. Without loss of generality
we suppose that

$y^*_{t,i_t} > 0$, $y^*_{t',i_{t'}} > 0$, $x(t') \le x(t) \le i_t < i_{t'}$ and $p_t/w_t > p_{t'}/w_{t'}$.

Then, there exist values $z_t$ and $z_{t'}$ such that $0 < z_t \le y^*_{t,i_t}$, $0 < z_{t'} \le y^*_{t',i_{t'}}$
and $z_t p_t = z_{t'} p_{t'}$. Now we exchange parts of the variables:

$y'_{t,i_t} = y^*_{t,i_t} - z_t$, $y'_{t,i_{t'}} = y^*_{t,i_{t'}} + z_t$,

$y'_{t',i_{t'}} = y^*_{t',i_{t'}} - z_{t'}$, $y'_{t',i_t} = y^*_{t',i_t} + z_{t'}$,

and $y'_{j,i} = y^*_{j,i}$ for the remaining variables. The new solution $(y'_{j,i})$ is feasible
and the objective value $\sum_{j \in S^\tau} w_j \sum_{i=x(j)}^{L} y'_{j,i} R_i$ is equal to
$\sum_{j \in S^\tau} w_j \sum_{i=x(j)}^{L} y^*_{j,i} R_i + R_{t,t'}$ where
$R_{t,t'} = (R_{i_t} - R_{i_{t'}})(w_{t'} z_{t'} - z_t w_t)$. Using $z_{t'} = z_t p_t/p_{t'}$, $p_t/w_t > p_{t'}/w_{t'}$
and $z_t > 0$, the second factor $(w_{t'} z_{t'} - z_t w_t) = z_t (w_{t'} p_t/p_{t'} - w_t)$ is larger than
0. The inequality $i_t < i_{t'}$ implies $R_{i_t} < R_{i_{t'}}$ and $R_{t,t'} < 0$. In other words, the
new solution $(y'_{j,i})$ has a lower objective value and gives us a contradiction. This
means that the two tasks $t$ and $t'$ are scheduled by Smith’s rule.

Now we use some properties of the optimal solution of the linear program
above. There is an optimal solution such that for each interval $I_i$ we have at most
one task $j \in S^\tau$ with $y_{j,i} \in (0, 1)$ that is assigned for the first time. Otherwise
we can use the same argument as above (and the fact that the quotients $p_j/w_j$ are
different) to improve the objective value. To turn the fractional solution into an
integral, we need only to increase the values p(/j,T) by at most ^ (because all 
tasks are small). Thus for all $\tau \subseteq M$ we have to create at most $2\epsilon I_i$ idle time.
Then we shift the schedule two intervals forward and use the created idle time
to reschedule small tasks by Smith’s rule.
Let $p(T')$ be the processing time of all tasks in $T' \subseteq T$. By delaying tasks
to later intervals we can bound the number of long tasks and the total length of
small tasks for each interval $I_x$.

Lemma 4. With $1 + O(\epsilon)$ loss, each instance $I$ of $Pm|fix_j, r_j, pmtn|\sum w_j C_j$
can be transformed in $O(n \log n)$ time into an instance $I'$ such that for each type
$\tau \subseteq M$ and release date $R_x$:

— ICRxl < K := 2^, where k = 51ogi_|_j 

- $p(ST_x^\tau) \le 2I_x$.

Proof. Consider an interval $I_x$. Using Lemma 3 we order $ST_x^\tau$ by Smith’s rule. In
the interval $I_x$ the total available time to schedule tasks from $T_x^\tau$ is $I_x$. Thus we
select tasks from $ST_x^\tau$ until the total processing time of selected tasks is greater
than or equal to $I_x$. Since $p_j \le I_x$ for each job $j$ in $ST_x^\tau$ we have at most $2I_x$ for

selected tasks. Within large tasks in CTf. of the same size we select at most 
tasks in order of decreasing weights (only they can be started in Ix). We have 
at most selected large jobs. After that we increase the release time of not 
selected tasks. 



The difficult part in the dynamic programming is to show that it is sufficient
to maintain information for a small number of tasks. To do this we introduce
a compact representation of small and long tasks. We start with the small tasks
and assume that the small tasks in $ST_x^\tau$ are ordered by Smith’s rule (i.e. in
increasing order of $p_j/w_j$). Then we select the tasks one by one and create
sets STf. ^ C 5T^, 1 < i < (the last sets may be empty) of lengths 

roughly equal to (but not greater than ^^). We always create a new set 

when the total processing time of tasks in ST^. ^ and the last selected 

task is greater than . This last selected task is placed into The 

following Lemma shows how the small tasks can be scheduled without increasing 
the objective function too much. 

Lemma 5. With 1 + 0(e) loss we can assume that in each interval ly, y > x 
for all subsets t C M, either 

— a consecutive sequence of task sets (at least one set) STf.ay! • • •> 

is scheduled in ly, or 

— all tasks in STf. have already been scheduled. 

Proof. Fix one processor set $\tau \subseteq M$ and consider all small tasks that require $\tau$.
Using Lemma 3, these small tasks are scheduled by Smith’s rule. Next consider
the first interval $I_y$, $y \ge x$, where the properties in the Lemma above do not
hold. Then there is one set $ST_{x,i}^\tau$ that is not completely scheduled in $I_y$ (or there
is no task from $ST_x^\tau$). If we increase the processing time $p(I_y, \tau)$ by an amount
of (for each such I^) then the sets can be scheduled completely in 

ly. The total enlargement for all Ix, x < y, is bounded by 



p(st:.o < E < E 



x<y 



x<y 



t>0 



2"*(1 + e)* 



< 



e(1 + e)Iy 



2e . 
< — T 
- 2 '- 



y 



This implies that we have to increase the processing times $p(I_y, \tau)$ by at most
$\frac{2\epsilon}{2^m} I_y$. For all processor sets $\tau \subseteq M$ we have to create at most $2\epsilon I_y$ idle time to
complete the small tasks from previous intervals in $I_y$. Again we create $2\epsilon I_y$ idle
time by shifting the schedule two intervals forward.



The schedule type above allows us now to represent the set ST^. in a more 
compact way by a set ST^ with at most T = ^ new created small tasks. Each 
task Tx^i G ST^ represents the corresponding set The processing time of 

Tx^i is equal to p{STl. j) and the weight Wx,i of Tx^i is equal to 
Finally the new tasks have to be processed in the total order Tx,i, Tx^ 2 , ■ ■ ■ , Tx^f- 
Finally we use a similar idea for the large tasks in We notice that 
contains at most a constant number of at most K = 2^ large tasks where 
k = 5 logi_|_£ i and the processing time pj of each large task j in is bounded 

by ff. 

Lemma 6. With 1 + 0(e) loss, we can assume that in each interval ly, y > x 
for all subsets t C M , the partial time pj^y that is used in ly to process a large 
task j G CTf., either 

- is equal to IpyAx, where Ax = ij,y € {1, ■ • ■ ,H}, H := = 

^2^ and k = 51ogi_|_g b, or 

— j has been already completed. 

Proof. Notice that the total enlargement needed in ly for tasks from CTf., x < y 
is at most since there are at most K = 2^ large tasks in £T^. The rest 
follows as in Lemma 5. 



Corollary 1. All tasks in $T_x = \bigcup_{\tau \subseteq M} T_x^\tau$ are scheduled completely within the
next $O(s) := \max\{T, H\}$ intervals following $I_x$.

For the dynamic programming, we now introduce a block structure on the
time line. The basic idea is to decompose the time line into a sequence of blocks.
Let $A = \{a_1, \ldots, a_r\}$ be the indices of release dates $R_{a_1}, \ldots, R_{a_r}$ with $a_1 < a_2 <
\ldots < a_r$. Corollary 1 implies that if $a_{i+1} - a_i > O(s)$ then all tasks that are
released at $R_{a_i}$ can be scheduled in the intervals $I_{a_i}, \ldots, I_{a_i + O(s)}$ (further we
consider only them). Thus, to find an optimal restricted schedule we have to
consider only $n \cdot O(s)$ intervals.

Let $B = \{a_1, a_1 + 1, \ldots, a_{r+1}\}$ be the indices of the corresponding intervals
(at most $n \cdot O(s)$), where $a_{r+1} = a_r + O(s)$. We partition the set $B$ into a sequence
of blocks $B_1, \ldots, B_r$ where $r$ is at most $n$ and each block $B_i$, $i = 1, \ldots, r$, consists
of $O(s)$ intervals with indices from $B$. Notice that the tasks that are released in
the intervals of block $B_i$ either finish in $B_i$ or in $B_{i+1}$; this set of tasks is denoted
by $BT_i$. Furthermore, there is at most a constant number $\mu := 2^m T\, O(s)$ of small
and a constant number $\nu := 2^m K\, O(s)$ of large tasks in $BT_i$.

Let $\mathcal{G}_i$ be the set of different ways in which the tasks from $BT_i$ can be scheduled
in the intervals of $B_i \cup B_{i+1}$. Each small task is processed completely in one of the $T$
intervals after its release. Each large task can be split over at most $H$ intervals
with sizes where £j^y € {1, . . . , iL}. This gives at most possibilities for 

a large task. The total number of different ways Gi G Gi to schedule these tasks 
is bounded by a constant g := (1 + . Now we can describe our 

objective function as follows: 

$\sum_j w_j R_{z(j)} = \sum_{i=1}^{r} W(i, G_i, G_{i-1}),$

where $W(i, G_i, G_{i-1})$ is the total weighted completion time of tasks that complete
in block $B_i$ corresponding to ways $G_i$ and $G_{i-1}$ (we use the fact that in
block $B_i$ only tasks that are released in $B_i$ and $B_{i-1}$ can be scheduled).

The dynamic programming table entry $O(i, G_i)$ stores the minimum weighted
completion time among all restricted schedules, where:

(1) $G_i$ represents the way in which the tasks released in $B_i$ are scheduled in the
intervals of blocks $B_i$ and $B_{i+1}$, and

(2) all tasks that are released before block $B_i$ are completely finished.

To compute the table $O$ we use the following recursive equation:

$O(i, G_i) = \begin{cases} W(1, G_1, \cdot), & G_1 \in \mathcal{G}_1 \text{ for } i = 1; \\ \min_{G_{i-1} \in \mathcal{G}_{i-1}} [O(i-1, G_{i-1}) + W(i, G_i, G_{i-1})], & G_i \in \mathcal{G}_i \text{ for } i = 2, \ldots, r. \end{cases}$
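
The recursion above is a standard forward dynamic program over the blocks. The sketch below shows only its shape, under stated assumptions: the candidate ways for each block and the cost function `W` are supplied by the caller, `W` returns `math.inf` for an infeasible combination, and the name `block_dp` is hypothetical. It keeps only the table row of the previous block.

```python
import math


def block_dp(r, ways, W):
    """Minimum total cost over one way per block, for blocks 1..r.

    ways: list where ways[i] holds the candidate ways for block i
          (index 0 is unused so that indices match the text).
    W:    function W(i, g, g_prev) giving the cost charged to block i;
          g_prev is None for the first block, and math.inf may be
          returned for an infeasible combination.
    """
    # Base case: first block, no predecessor.
    table = {g: W(1, g, None) for g in ways[1]}
    for i in range(2, r + 1):
        table = {
            g: min(table[g_prev] + W(i, g, g_prev) for g_prev in ways[i - 1])
            for g in ways[i]
        }
    return min(table.values(), default=math.inf)
```

With a constant number of ways per block, each of the $r$ rows costs a constant number of evaluations of `W`, which matches the linear number of table entries discussed in the text.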



Lemma 7. The time to compute the table $O(i)$ for all $1 \le i \le r$ can be bounded
by $O(n) \cdot T(W, n)$ where $T(W, n)$ denotes the maximal time to compute the function
$W$ for one triple $(i, G_i, G_{i-1})$.

In the following we describe the procedure to test the feasibility for the set
of tasks defined by $G_i$ and $G_{i-1}$ to be scheduled in $B_i$ and to compute the
value $W(i, G_i, G_{i-1})$. First there are $O(s)$ intervals in $B_i$. Using the information
from $G_i$ and $G_{i-1}$ we know precisely the finishing interval $I_{z(j)}$ of each task
$j \in BT_{i-1} \cup BT_i$. Thus if $G_i$ and $G_{i-1}$ give us a feasible schedule then we
can compute the value $W$ directly. To test the feasibility we use the following
idea. Consider the intervals in $B_i$. For each interval $I_{i,t}$, $t = 1, \ldots, O(s)$, in $B_i$
we compute the set $V_t$ of tasks that are processed in $I_{i,t}$ and the set $P_{V_t}$ of
processing times that are used to process these tasks. Then we have to verify
whether the set $V_t$ with $P_{V_t}$ can be scheduled in $I_{i,t}$ for each $t = 1, \ldots, O(s)$. In
total, the problem of testing is equivalent to a sequence (of constant length $O(s)$)
of problem instances of $Pm|fix_j, pmtn|C_{\max}$. Each such problem can be solved
optimally in time linear in the number of tasks in the instance [2].
Since the number of tasks in each set $V_t$, $t = 1, \ldots, O(s)$, is at most the number
of tasks defined by $G_i$ and $G_{i-1}$ (this number is constant because there are only
$O(1)$ tasks released in $B_i$ and $B_{i-1}$), we have obtained the following result:

Lemma 8. Given $G_{i-1}$ and $G_i$, the feasibility test and the computation of the
value $W(i, G_i, G_{i-1})$ can be done in $O(1)$ time.

This Lemma implies that the time to compute the entire table is bounded by
$O(n)$ and that our algorithm for the preemptive variant runs in $O(n \log n)$ time.
Abstract. Evidence indicates that members of many gene families in 
the genome of an organism tend to have homologues both within their 
own genome and in the genomes of other organisms. Amongst these ho- 
mologues, typically only one or a few per genome perform an analogous 
function in their genome. Finding subsets of these genes which show 
evidence of performing a common function is an important first step to- 
wards, for instance, the creation of phylogenetic trees, multiple sequence 
alignments and secondary structure predictions. 

Given a collection of taxa $P = \{P_1, P_2, \ldots, P_k\}$ where $P_i$ contains genes
$\{p_{i,1}, p_{i,2}, \ldots, p_{i,n_i}\}$, we ask to choose one gene from each of the taxa $P_i$
such that these chosen vertices most agree. We define most agreeing in
three distinct ways: most tree-like, pairwise closest, and pairwise most
similar.

We show these problems to be computationally hard from almost every 
angle via classical, parameterized and approximation complexity theory. 
However, on the positive side, we give randomized approximation algorithms
following ideas from [GGR98] for the pairwise closest and pairwise
most similar variants.



1 Introduction 

Given a new nucleo- or peptide sequence, the standard “first step” of any inquiry 
into the determination of the evolution, chemical properties, and (ultimately) 
function of this biomolecule is to align it against every entry in a large molecular 
dataset such as EMBL[S99] or SwissProt[BA]. Since properties such as function 
are extremely complex and still largely unknown, no simple search of a dataset 
can answer these questions directly. The standard alignment tools [AGMWL90, 
PL88] only return entries which show statistically significant signs of pairwise 
evolutionary relationships. The end result is that many of the returned sequences 
will belong to gene families other than the family of our new sequence. 

There are many reasons why this is the case, such as partial domain agreement,
long distance homology and paralogy via gene duplications and losses. We refer
readers to [B99], [BDDEHY98], [GGMRM79], [KTG98], [MMS95], [P98], [PG97],
[S99], [SM98], [TKL97], [YEVB98] and the authors’ paper [HL99a] for a more
thorough treatment of the problems, models, and experimental results.

In any study of evolution, chemical properties, and function, care must be 
taken to use sequences that are all pairwise homologous (all related by a common 
evolutionary ancestor) and that all perform an analogous function^ in their re- 
spective genome. When such care is not taken in the selection of sequences, gene 
trees will not reflect the true evolutionary relationships of the species, multiple 
sequence alignments will not display regions of conservation and change, and 
predictions of secondary structure will be inaccurate [B92, BDDEHY98, F88]. 

We introduce the following model of the above selection problem. A collection
of sets $P = \{P_1, P_2, \ldots, P_k\}$ is given where $P_i$ corresponds to taxon $i$ and contains
the homologues $\{p_{1,i}, p_{2,i}, \ldots, p_{n_i,i}\}$ found in the genome of taxon $i$. The goal is to
choose one gene from each of the $P_i$ such that these genes agree the most. Such
a subset is referred to as a core of the weighted k-partite graph. We introduce
three distinct definitions of most agreeing: most tree-like, pairwise closest, and
pairwise most similar.

Most Tree Like in a k-Partite Graph (Core-Tree)
input: A complete k-partite graph $G = (P_1, P_2, \ldots, P_k, E)$, edge weights
$w : E \to \mathbb{R}$.

output: A set $P' = \{p_1, p_2, \ldots, p_k\}$ where $p_i \in P_i$ such that $\|D(P') -
A(D(P'))\|_z$ is minimized where $D(P')$ is the distance matrix formed
in the obvious way from $P'$ and $A(D(P'))$ is the closest additive
approximation to $D(P')$ under the $L_z$ norm for some $z \in \{1, 2, \ldots, \infty\}$.

That is, one vertex (one gene) is selected from each partition (each genome) 
such that the distance matrix formed from the pairwise comparisons of the genes 
is as close to additive (as close to “tree-like”) as possible. The assumption behind 
this optimization criterion is that genes which have a different function (hence,
a significantly different underlying sequence) than the gene family, should intro- 
duce non-additivity when placed into a distance matrix consisting of genes from 
the gene family. 

Assume we are given a set of homologous genes where some are functionally 
analogous and others are effectively functionally inactive. The functionally in- 
active copies of the gene should drift in a random direction through the amino 
acid “space” whilst the functionally active genes in our family should mutate 
relatively slowly. Therefore the genes performing analogous function should be 
identifiable by being mutually similar or closer in distance than any other homologues.
Furthermore, sequences which have domains foreign to the gene family
will also induce distance measures significantly greater than pairwise measure- 
ments between members of the gene family. We arrive at our second and third 
notions of most agreeing: 

^ We say analogous function here and not simply function to stress that the role a 
specific gene in a family plays is almost never exactly the same between organisms. 
Minimum Weight Clique in a k-Partite Graph (Core-Clique)
input: A complete $k$-partite graph $G = (P_1, P_2, \ldots, P_k, E)$, edge weights
$w : E \to \mathbb{R}$.

output: A set $P' = \{p_1, p_2, \ldots, p_k\}$ such that $p_i \in P_i$ and
$\sum_{1 \le i < j \le k} w(p_i, p_j)$ is minimum.

Note that the edges between vertices in different partitions could correspond
to either (1) an estimate of the distance between the two genes, or (2) a
statistical measure of similarity (e.g. a maximum likelihood score). The first
variant induces a minimization problem whilst the second variant induces a
maximization problem. In most cases, the behaviour of either problem is the same
and thus we focus attention on the former. Note also that the gene family is not
assumed to have any sort of nice “tree-like” behavior. This problem may be
particularly suited to studying microbial taxa as it is becoming clear that gene
and species phylogenies are often tentative at best.

In the remainder of this paper we show that choosing cores under any of these
optimization criteria is hard from the classical, parameterized and approximation
complexity frameworks. That is, the general versions of these problems are
NP-complete and hard for complexity class W[1] for versions of the problem when
the number of partitions, the size of each partition, the maximum weight of
an edge, or the overall weight of the core are parameters. We also show that
none of these problems is approximable within a polynomial function of $n$
in polynomial time. On the positive side, we give a randomized approximation
algorithm using ideas from [GGR98, RS96] for these last two problems. For a
confidence parameter $\delta$ and an accuracy parameter $\epsilon$, this algorithm
will find a core-clique of weight at most $opt + \epsilon\sigma k^2$ with
probability $1 - \delta/2$, where $opt$ is the weight of the optimal core-clique
in the input graph, $k$ is the number of partitions and $\sigma$ is the maximum
difference between the weights of two edges adjacent to the same vertex.



2 Background 

Trees and Graphs A phylogenetic tree $T = (V, E)$ is a binary connected acyclic
graph. A leaf in $T$ has degree 1 and $L_T$ is used to denote the subset of $V$
which contains the leaves of $T$. For $S \subseteq T$, we let $T[S]$ represent the
subtree of $T$ induced by $S$. A weighted phylogenetic tree is a phylogenetic tree
with a weight function associated with the edges, $T = (V, E, w)$ where
$w : E_T \to [0, \infty)$. A complete $k$-partite graph is a $(k+1)$-tuple
$G = (P_1, P_2, \ldots, P_k, E)$ where $P_i$ contains vertices
$\{p_{i,1}, p_{i,2}, \ldots, p_{i,n_i}\}$ for some $n_i$, where $P_i \cap P_j = \emptyset$
for $i \ne j$, and where $E$, the edge set, contains edges between every two
vertices in two different partitions $P_i$ and $P_j$.
Weighted $k$-partite graphs are defined similarly. A clique of size $t$ in a graph $G$
is a set of $t$ distinct vertices which are mutually adjacent. The weight of an edge
is written $w(x, y)$ as shorthand for $w((x, y))$ for some edge $(x, y)$.

Distance/Similarity Matrices A distance matrix $D$ is a 0 diagonal, symmetric,
nonnegative matrix, indexed by the set of taxa $L_T$ for a phylogenetic tree
$T$, where the entry $D_{ij}$ is the distance (an estimated distance) between
taxon $i$ and taxon $j$. An $n \times n$ distance matrix $D$ is additive if there
exists a weighted phylogenetic tree $T$ with $n$ leaves such that entry $D_{ij}$
equals the sum of the edge weights in the tree along the path connecting $i$ and
$j$. A similarity matrix $S$ is the same as a distance matrix except that diagonal
elements have value $\infty$ and entry $S_{ij}$ is a similarity score between taxa
$i$ and $j$.

Theorem 1 ([B71]). A matrix $D$ is additive if and only if for all $i, j, k, l$ (not
necessarily distinct), the maximum of $D_{ij} + D_{kl}$, $D_{ik} + D_{jl}$,
$D_{il} + D_{jk}$ is not unique. The edge weighted tree (with positive weights on
internal edges and nonnegative weights on leaf edges) representing the additive
distance matrix is unique among the trees without vertices of degree two.
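
The condition of Theorem 1 is straightforward to check directly. The following is a minimal sketch, not taken from the paper: the function name `is_additive` and the tolerance `tol` are assumptions introduced here, and the matrix is assumed to be symmetric with a zero diagonal. Because the theorem allows repeated indices, the loop ranges over multisets of four indices, which also enforces the triangle inequality.

```python
from itertools import combinations_with_replacement


def is_additive(D, tol=1e-9):
    """Check the four-point condition of Theorem 1 on a symmetric matrix D.

    D is a list of lists with zero diagonal. `tol` is a hypothetical
    tolerance for floating-point distances.
    """
    n = len(D)
    # Indices need not be distinct, so iterate over multisets {i, j, k, l}.
    for i, j, k, l in combinations_with_replacement(range(n), 4):
        sums = sorted((D[i][j] + D[k][l],
                       D[i][k] + D[j][l],
                       D[i][l] + D[j][k]))
        # The maximum of the three pairings must be attained at least twice.
        if sums[2] - sums[1] > tol:
            return False
    return True


# Quartet ((a,b),(c,d)) with leaf edges of weight 1 and an internal edge of weight 2.
tree_like = [[0, 2, 4, 4],
             [2, 0, 4, 4],
             [4, 4, 0, 2],
             [4, 4, 2, 0]]
# Same matrix with the distance between the first and last taxon perturbed.
perturbed = [[0, 2, 4, 5],
             [2, 0, 4, 4],
             [4, 4, 0, 2],
             [5, 4, 2, 0]]
```

The check takes time proportional to $n^4$, which is adequate for the small matrices induced by a single candidate core.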

Error Measurements The $L_k$ norm between distance matrices $D$ and $D'$,
written $\|D - D'\|_k$, is defined as
$\|D - D'\|_k = \left(\sum_{i<j} |D_{ij} - D'_{ij}|^k\right)^{1/k}$ for $k \ge 1$.
For $k = \infty$, the $L_\infty$ norm is defined as
$\|D - D'\|_\infty = \max_{i<j} |D_{ij} - D'_{ij}|$.
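
As an illustration of these error measurements, the sketch below computes both norms for two small matrices. The function name `matrix_norm` is an assumption made here, and the sum is taken over pairs $i < j$, which is how the garbled index range in the source is read.

```python
def matrix_norm(D, D2, k=float("inf")):
    """L_k distance between two symmetric distance matrices (pairs i < j only)."""
    n = len(D)
    diffs = [abs(D[i][j] - D2[i][j]) for i in range(n) for j in range(i + 1, n)]
    if k == float("inf"):
        # L_infinity: the largest entrywise deviation.
        return max(diffs, default=0.0)
    return sum(d ** k for d in diffs) ** (1.0 / k)


D = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
D2 = [[0, 2, 2], [2, 0, 7], [2, 7, 0]]
l1 = matrix_norm(D, D2, 1)        # entrywise deviations are 1, 0 and 4
linf = matrix_norm(D, D2)
```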
Approximation Ratios An approximation algorithm is said to achieve an
approximation ratio of $\alpha$ for a maximization problem $\Pi$ if for each input
$x$, it computes a solution $y$ of cost at least $OPT/\alpha$, where $OPT$ is the
cost of the optimum. For a minimization problem, the algorithm must return a
solution $y$ of cost at most $\alpha \cdot OPT$. Note that $\alpha \ge 1$.

We refer the reader to [DF99] for a complete description of parameterized 
complexity. The following three items are the main ingredients of this tool. 
FPT, Completeness, Reductions (1) For a parameterized language $L$,
$L \subseteq \Sigma^* \times \Sigma^*$ with elements $(x, k)$ ($k$ is the parameter),
we say that $L$ is (uniformly) fixed parameter tractable (FPT) if there exists a
constant $\alpha$ and an algorithm $\Phi$ such that $\Phi$ decides if $(x, k) \in L$
in time $f(k)|x|^\alpha$, where $f : \mathbb{N} \to \mathbb{N}$ is an arbitrary function.
(2) We say that $L$ reduces to $L'$ by a standard parameterized m-reduction if
there is an algorithm $\Phi$ which transforms $(x, k)$ into $(x', g(k))$ in time
$f(k)|x|^\alpha$, where $f, g : \mathbb{N} \to \mathbb{N}$ are arbitrary functions and
$\alpha$ is a constant independent of $k$, so that $(x, k) \in L$ if and only if
$(x', g(k)) \in L'$. (3) $k$-Clique, parameterized by the clique size $k$, is
complete for complexity class W[1]. That is, $k$-Clique is not in FPT unless
problems like $k$-Step Turing Machine and many other problems whose best known
algorithms run in time $\Omega(n^k)$ can be solved in FPT ($f(k) \cdot n^\alpha$) time.



3 Complexity Results 

3.1 Core-Clique. The decision version of this problem takes as input a
parameter $r \in \mathbb{R}$ and answers “yes” iff the core-clique has weight $\le r$.
Theorem 2 below states that even when the number of candidate genes per genome is
bounded by 3, an extremely simple weighting function is used, and a bound of 0 is
placed on the weight of the core-clique, the problem remains NP-complete. Theorem 3
states that a modified (easier to approximate) version of Core-Clique cannot
be approximated within any function of $n$ (the number of vertices of the input
graph) in polynomial time. Both these theorems follow easily from the following
lemma.

Lemma 1. Let $f(n)$ be a function such that $f(n) > 0$ for all $n \ge 1$. Then
Core-Clique restricted to partitions of size 3 and with a weighting function $w$
which assigns an edge either 0 or $f(n)$, and $r = 0$, is NP-complete, where $n$ is
the size of the input graph.

Proof. The problem is in NP. To show hardness, we reduce from 3SAT, which
accepts as input a formula $\varphi$ in 3-CNF over a set of variables
$X = \{x_1, x_2, \ldots, x_t\}$, and asks if there is a truth assignment to $X$ such
that each clause of $\varphi$ has at least one true literal.

Let $X = \{x_1, x_2, \ldots, x_t\}$ be the set of variables and
$C = \{C_1, C_2, \ldots, C_k\}$ be the set of clauses of an arbitrary instance of
this problem. To construct an instance of the Core-Clique problem $(G, w, r)$, we
create $k$ partitions $P_1, P_2, \ldots, P_k$ and associate $P_i$ with clause $C_i$.
The 3 vertices in $P_i$ are labeled by the literals in $C_i$. The weight of an edge
between two vertices in different partitions corresponding to two complementary
literals $x_j$ and $\bar{x}_j$ is $f(n)$. Otherwise, the weight is 0.

Claim. $G$ has a weight 0 core-clique if and only if $\varphi$ is satisfiable.

($\Rightarrow$) Let $p_1^*, p_2^*, \ldots, p_k^*$ be the set of vertices which induce a
core-clique of weight 0. There can be no weight $f(n)$ edge between any $p_i^*$
and $p_j^*$, which implies that it is never the case that $p_i^*$ is some literal
$x$ whilst $p_j^*$ is the negated literal $\bar{x}$. Hence, we may set the literal
$p_i^*$ to be true. Since we may do this for all $k$ of the partitions, we have a
truth assignment for $\varphi$ with at least one true literal in each clause.

($\Leftarrow$) Let $T : X \to \{true, false\}$ be a truth assignment to $\varphi$ such
that at least one literal $x$ in each clause $C_i$ is true. Consider any two
distinct such literals $x_i$ and $x_j$ which are true in clauses $C_i$ and $C_j$.
Then the vertex labelled $x_i$ in $P_i$ and the vertex labelled $x_j$ in $P_j$ have
no weight $f(n)$ edge between them, since $T$ is a satisfying assignment for
$\varphi$ and there is an edge of weight $f(n)$ only if two literals are negations
of each other. Hence, we may place $x_i$ and $x_j$ in the core-clique.
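
The construction used in this proof can be sketched in a few lines. This is an illustrative sketch only: the encoding of literals as signed integers (`j` for a variable and `-j` for its negation), the vertex representation `(clause_index, position)` and the function name `core_clique_from_3sat` are assumptions made here, not part of the paper.

```python
def core_clique_from_3sat(clauses, f=1):
    """Build the Core-Clique instance of Lemma 1 from a 3-CNF formula.

    clauses: list of tuples of non-zero ints, e.g. (1, -2, 3).
    f: the positive weight f(n) placed on edges between complementary literals.
    Returns (partitions, weight), where weight(u, v) is defined for vertices
    in different partitions.
    """
    # One partition per clause; each vertex is (clause index, position in clause).
    partitions = [[(c, pos) for pos in range(len(clause))]
                  for c, clause in enumerate(clauses)]

    def weight(u, v):
        (cu, pu), (cv, pv) = u, v
        if cu == cv:
            raise ValueError("no edges inside a partition")
        # Weight f(n) exactly when the two literals negate each other.
        return f if clauses[cu][pu] == -clauses[cv][pv] else 0

    return partitions, weight


partitions, weight = core_clique_from_3sat([(1, 2, 3), (-1, 2, -3)], f=7)
```

A core of weight 0 in the resulting instance picks one literal per clause with no complementary pair, which is exactly a consistent choice of true literals.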

Theorem 2. Core-Clique restricted to partitions of size 3 and with a weighting
function $w$ which assigns an edge either 0 or 1, and $r = 0$, is NP-complete.

No minimization problem for which it is NP-complete to distinguish between
instances with 0 minimum cost and instances with cost $c > 0$ can be
approximated within any ratio in polynomial time. Since this comment applies
to the Core-Clique problem, we formulate a slightly modified version of
the optimization form (Modified-Core-Clique) of the problem which asks
for the $P'$ which minimizes $1 + \sum_{1 \le i < j \le k} w(p_i, p_j)$, for which
non-trivial non-approximability results can be proved.

Theorem 3. If P $\ne$ NP, then Modified-Core-Clique is not approximable
within any function of $n$ in polynomial time, where $n$ is the size of the input
graph.
Proof. Assume that Modified-Core-Clique can be approximated in polynomial
time to within a function $g(n)$. It follows immediately that
$g(n) \ge 1$ for all $n \ge 1$. By Lemma 1 (taking $f(n) = g(n)$), it is NP-hard to
distinguish between instances of Modified-Core-Clique with a minimum of 1 and
those with a minimum of at least $1 + g(n)$. However, using the assumed
approximation algorithm it is possible to distinguish between such instances.
From this contradiction the theorem follows.
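
The gap behind this argument can be written out explicitly. The following is a sketch under two assumptions made here: Lemma 1 is applied with $f(n) = g(n)$, and $A(G, w)$ denotes the value returned by the hypothetical $g(n)$-approximation algorithm on the instance built from a 3-CNF formula $\varphi$.

```latex
\begin{align*}
\varphi \text{ satisfiable}
  &\;\Longrightarrow\; \mathrm{OPT} = 1 + 0 = 1
  &&\Longrightarrow\; A(G, w) \le g(n) \cdot \mathrm{OPT} = g(n), \\
\varphi \text{ unsatisfiable}
  &\;\Longrightarrow\; \mathrm{OPT} \ge 1 + f(n) = 1 + g(n)
  &&\Longrightarrow\; A(G, w) \ge \mathrm{OPT} > g(n).
\end{align*}
```

Comparing $A(G, w)$ with $g(n)$ would therefore decide 3SAT in polynomial time.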

Next we examine the Core-Clique problem from the perspective of parameterized
complexity (see § 2 and [DF99]). The main principle here is that,
although the general form of the problem is NP-complete, our reduction does
not disclose exactly where the source of intractability lies. We see at least the
following four possible parameterizations of the problem: (1) $m = \max_i |P_i|$,
the maximum size of a partition, (2) $k$, the number of partitions, (3) $r$, the
total weight of the core-clique, and (4) $\omega$, the maximum weight of a distance
between two vertices. Note that Theorem 2 shows that no subset of parameters 1, 3
and 4 is enough, as the problem remains NP-complete. Our next theorem rules
out the possibility of an FPT algorithm for any subset of parameters 2, 3, and 4.

Theorem 4. 2, 3, 4-Core-Clique is hard for W[1].

Proof. Let $(C = (V_C, E_C), K)$ be an instance of the $K$-Clique Problem. We
construct an instance of the Core-Clique problem
$(G = (P_1, P_2, \ldots, P_k, E_G), w, r)$, where $r$, $\omega$, and $k$ are functions
depending only on $K$, and show that $(C, K)$ is a “yes” instance if and only if
$(G, w, r)$ is a “yes” instance.

Let the vertices in $V_C$ be labeled by $1, 2, \ldots, |V_C| = m$. Let
$r = \binom{K}{2}$. We create partitions $P_1, P_2, \ldots, P_{K=k}$ and include
vertices labeled $p_{i,j}$ for $1 \le j \le m$ in partition $P_i$. We place an edge
between all vertices in $G$ which are not in the same partition: for all $i, j$,
$1 \le i < j \le k$, and for all $q, q'$, $1 \le q, q' \le m$,
$(p_{i,q}, p_{j,q'}) \in E_G$. If $(u, v) \notin E_C$, then $w(p_{i,u}, p_{j,v}) = c$
for all $1 \le i < j \le k$, where $c$ is an arbitrarily large constant at least as
big as $\binom{K}{2} + 1$. If $(u, v) \in E_C$, then $w(p_{i,u}, p_{j,v}) = 1$ for
all $1 \le i < j \le k$. For all edges of the form $(p_{i,u}, p_{j,u}) \in E_G$,
let $w(p_{i,u}, p_{j,u}) = c$.

We omit the remainder of the (straightforward) argument due to space lim- 
itations. 
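
The instance built in this proof can be sketched as follows. The sketch assumes the reading $r = \binom{K}{2}$ and $c = \binom{K}{2} + 1$ of the garbled source; the function name `core_clique_from_clique` and the vertex representation `(i, u)` for $p_{i,u}$ are assumptions introduced here.

```python
from math import comb


def core_clique_from_clique(m, edges, K):
    """Build a Core-Clique instance from a K-Clique instance.

    m: number of vertices of the input graph C, labeled 1..m.
    edges: iterable of pairs (u, v) forming E_C (no self-loops).
    Returns (partitions, weight, r).
    """
    adjacent = {frozenset(e) for e in edges}
    r = comb(K, 2)   # assumed bound: one weight-1 edge per pair of partitions
    c = r + 1        # any single "bad" edge already exceeds the bound r
    # K partitions, each holding one copy p_{i,u} of every vertex u of C.
    partitions = [[(i, u) for u in range(1, m + 1)] for i in range(1, K + 1)]

    def weight(p, q):
        (i, u), (j, v) = p, q
        if i == j:
            raise ValueError("no edges inside a partition")
        # Copies of adjacent vertices cost 1; non-adjacent pairs and two
        # copies of the same vertex (u == v) cost c.
        return 1 if frozenset((u, v)) in adjacent else c

    return partitions, weight, r


parts, w, r = core_clique_from_clique(3, [(1, 2), (2, 3)], K=3)
```

Under this reading, a core of weight at most $r$ must use only weight-1 edges, so its $K$ chosen labels are distinct and pairwise adjacent in $C$.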

Observe that 1, 2-Core-Clique is fixed parameter tractable with an algorithm
running in time $O(m^k)$. We simply try all $O(m^k)$ possible ortho-sets.
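
A minimal sketch of this exhaustive search is given below. The function name `min_core_clique` and the example weights are assumptions made for illustration; each candidate core also costs a pass over its pairs, which the $O(m^k)$ bound above treats as a lower-order factor.

```python
from itertools import combinations, product


def min_core_clique(partitions, weight):
    """Try every choice of one vertex per partition and keep the lightest."""
    best_core, best_weight = None, float("inf")
    for core in product(*partitions):
        total = sum(weight(u, v) for u, v in combinations(core, 2))
        if total < best_weight:
            best_core, best_weight = core, total
    return best_core, best_weight


# Hypothetical toy instance with three genomes and made-up distances.
W = {frozenset(e): x for e, x in [
    (("a1", "b1"), 5), (("a1", "b2"), 1), (("a2", "b1"), 2), (("a2", "b2"), 4),
    (("a1", "c1"), 3), (("a2", "c1"), 1), (("b1", "c1"), 1), (("b2", "c1"), 6),
]}
core, total = min_core_clique(
    [["a1", "a2"], ["b1", "b2"], ["c1"]],
    lambda u, v: W[frozenset((u, v))],
)
```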

Theorem 2 shows that the problem remains hard for partition size 3 with 
constant edge weight functions and a constant bound on the core-clique. Our 
next theorem shows that restricted to partition size 2 and constant edge weight 
functions it still stays hard. 

Theorem 5. 1, 4-Core-Clique is NP-complete even when the number of vertices
in each partition is at most 2 and the edges are assigned a weight of either
0 or 1.

Proof. Reduction from the Maximum 2SAT problem omitted. 
3.2 Most Tree Like. We restrict our attention to the $L_\infty$ norm throughout
the following analysis, but note that our reductions also work for other norms.
Clearly, the decision version of the Core-Tree problem, which asks if there is
a $P'$ such that $\|D(P') - A(D(P'))\|_\infty \le \Delta$ for input parameter
$\Delta \in \mathbb{R}$, is NP-complete since Numerical Taxonomy [ABFNPT96] ^ is
simply a restricted version (specifically, all partitions having size 1) of it.
We begin our analysis with a sub-version of the problem where we ask if there
exists a choice of one leaf from each partition in the input graph that induces
an additive tree. Furthermore, we are given the unweighted topology of the tree,
so the problem reduces to just choosing one vertex per partition so that the
pairwise distances fit to the tree. This problem, when each partition just has a
single vertex, is not NP-complete [F88].

Exact Tree in a k-Partite Graph (Exact-Core-Tree)
input: As with Core-Tree but also an unweighted leaf-labeled tree $T$
with each leaf receiving a distinct label from $\{P_1, P_2, \ldots, P_k\}$.
question: Does there exist a set $P' = \{p_1, p_2, \ldots, p_k\}$ where $p_i \in P_i$
such that $D(P')$ is additive, where $D(P')$ is the distance matrix formed
from $P'$, and such that the corresponding tree $T(D(P'))$ is isomorphic
to $T$ and, for each leaf $u \in T(D(P'))$ with $u \in P_i$, the corresponding
leaf in $T$ has label $P_i$?

Again, we analyze this problem from the perspective of parameterized
complexity. Our parameters remain the same: (1) $m = \max_i |P_i|$, the maximum
size of a partition, (2) $k$, the number of partitions, (3) $r$, the total weight of
the core-tree, and (4) $\omega$, the maximum weight of a distance between two leaves.
Our first theorem shows that no FPT algorithms are possible for any subset of
parameters 2, 3, or 4, unless W[1] = FPT.

Theorem 6. 2, 3, 4-Exact-Core-Tree is hard for W[1].

Proof. Given an instance of the $K$-Clique Problem $(C = (V_C, E_C), K)$, we create
an instance of the 2, 3, 4-Exact-Core-Tree problem $(G, w, T)$ and show that
$(C, K)$ is a “yes” instance if and only if $(G, w, T)$ is a “yes” instance.

We construct $K + 4 (= k)$ partitions $\{A, B, C, D, P_1, P_2, \ldots, P_K\}$.
Partition $A$ contains one vertex $a$, $B$ contains $b$, $C$ contains $c$, and $D$
contains $d$. Each partition $P_i$ contains $|V_C| = m$ vertices labeled
$p_{i,1}, p_{i,2}, \ldots, p_{i,m}$. Our tree $T$ is created as in Figure 1: the
caterpillar with $(A, B)$ and $(C, D)$ as its “head” and “tail”. That is, our tree
has internal vertices $\{h, t, n_1, \ldots, n_K\}$ with edges
$\{(h, A), (h, B), (t, C), (t, D), (h, n_1), (t, n_K)\}$ and
$\{(n_i, n_{i+1}) : 1 \le i < K\}$.

Let $D_{a,b} = D_{c,d} = 2$ and $D_{a,c} = D_{a,d} = D_{b,c} = D_{b,d} = 4 + (K - 1)$.
Let $D_{x,p_{i,j}} = 2 + i$ for $x \in \{a, b\}$, $1 \le i \le K$ and $1 \le j \le m$.
Let $D_{y,p_{i,j}} = 2 + (K - i + 1)$ for $y \in \{c, d\}$, $1 \le i \le K$ and
$1 \le j \le m$. Let $D_{p_{i,j},p_{i',j}} = 3K + 10$ for all $1 \le i \ne i' \le K$
and $1 \le j \le m$. If $(u, v) \notin E_C$, then $D_{p_{i,u},p_{i',v}} = 3K + 10$ for
all $1 \le i \ne i' \le K$. If $(u, v) \in E_C$, then for $1 \le i < j \le K$,
$D_{p_{i,u},p_{j,v}} = 2 + j - i$.

^ Numerical Taxonomy, input: An $n \times n$ distance matrix $D$, a bound
$\Delta \in \mathbb{R}$. question: Is $\|A(D) - D\|_\infty \le \Delta$?
($\Rightarrow$) Let $V' = \{v_1, v_2, \ldots, v_K\}$, where $v_i \in V_C$, be a clique
in $C$. We show how to choose one vertex from each of the $P_i$ in $G$ such that the
distance matrix formed from these vertices together with $a$, $b$, $c$ and $d$ is
additive. Note that we must choose $a$, $b$, $c$ and $d$, and that the distance
matrix these four vertices induce is additive (see Theorem 1) and agrees with the
topology $T$.

Now consider the set of vertices $\{p_{1,v_1}, p_{2,v_2}, \ldots, p_{K,v_K}\} = P'$.
From the construction, $D_{p_{i,v_i},p_{j,v_j}} = 2 + j - i$ for $i < j$, as any two
distinct vertices $v_i, v_j$ from the clique are mutually adjacent. We must show how
weights can be applied to the edges of $T$ such that the distances in $T$ between
$p_{i,v_i}$ and $p_{j,v_j}$, written $d_T(p_{i,v_i}, p_{j,v_j})$, are equal to the
entries $D_{p_{i,v_i},p_{j,v_j}}$. This can be accomplished by assigning 1 to
every edge on the path between $p_{i,v_i}$ and $p_{j,v_j}$ in $T$. It is easy to
verify that $d_T(x, p_{i,v_i}) = D_{x,p_{i,v_i}}$ for $x \in \{a, b, c, d\}$ and that
the matrix can be realized as a tree.
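
The verification can be spelled out as follows. This is a sketch under the assumption, not stated explicitly in the edge list above, that each leaf $P_i$ is attached to the internal vertex $n_i$ by its own edge, and that every edge of $T$ receives weight 1; $d_T$ denotes path length in $T$.

```latex
\begin{align*}
d_T(a, b) &= 1 + 1 = 2, \\
d_T(a, c) &= 1 + (K + 1) + 1 = 4 + (K - 1), \\
d_T(a, p_{i,v_i}) &= 1 + i + 1 = 2 + i, \\
d_T(c, p_{i,v_i}) &= 1 + (K - i + 1) + 1 = 2 + (K - i + 1), \\
d_T(p_{i,v_i}, p_{j,v_j}) &= 1 + (j - i) + 1 = 2 + j - i \qquad (i < j).
\end{align*}
```

Here the path from $h$ to $n_i$ uses $i$ edges, the path from $n_i$ to $t$ uses $K - i + 1$ edges, and the path from $h$ to $t$ uses $K + 1$ edges, so each value agrees with the corresponding entry of $D$ set in the construction.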

($\Leftarrow$) Let $P' = \{a, b, c, d, p_1, p_2, \ldots, p_K\}$ be the set of vertices
from $G$ which induces a tree with topology $T$. By Theorem 1, the underlying
distance matrix $D$ is additive. For a leaf vertex $x$, let $n(x)$ be the unique
neighbour of $x$ in $T$. Focus on the four vertices $\{a, b, c, d\}$. By Theorem 1,
the edge weights in this subtree must be 1 for edges of the form $(x, n(x))$ where
$x \in \{a, b, c, d\}$. The path between $(a, b)$ and $(c, d)$ receives weight
$2 + (K - 1)$. We now analyze the “choice” of vertices $\{p_1, p_2, \ldots, p_K\}$.
Claim: [No Fit] $P'$ does not contain two vertices $p_{i,j}$ and $p_{i',j}$,
$i \ne i'$.

(By contradiction) Suppose there exist $p_{i,j}, p_{i',j} \in P'$ simultaneously
(w.l.o.g. $i < i'$). Then, by the construction, $D_{p_{i,j},a} = 2 + i$,
$D_{p_{i,j},c} = 2 + (K - i + 1)$, $D_{a,b} = 2$ and
$D_{p_{i,j},p_{i',j}} = 3K + 10$. Focus on the quartet formed by
$\{a, b, p_{i,j}, c\}$. It is easy to verify that the edge $(p_{i,j}, n(p_{i,j}))$
must have weight 1. Furthermore, the path from vertex $(AB)$ to $n(p_{i,j})$ must
have total weight $i$ and the path from vertex $(CD)$ to $n(p_{i,j})$ must have
weight $K - i + 1$. The same argument holds for the edge weights in the quartet
$\{a, b, p_{i',j}, c\}$, that is, the edge weight of $(p_{i',j}, n(p_{i',j}))$ is
also 1. Allowing $n(x)$ to denote the unique neighbor of a leaf vertex $x$ in $T$,
it is easy to verify that the weight of the path from $n(p_{i,j})$ to
$n(p_{i',j})$ must be $i' - i$. Since $i' - i + 2 < 3K + 10$, we reach a
contradiction, since we cannot assign edge weights to $T$ so that they agree with
the distance matrix induced by $\{a, b, c, d, p_{i,j}, p_{i',j}\}$. Hence, by
Theorem 1, this matrix is not additive.

Claim. $P'$ does not contain two vertices $p_{i,j}$ and $p_{i',j'}$, $i < i'$,
$j \ne j'$, such that $(v_j, v_{j'}) \notin E_C$.

This claim can be proved in the same way as Claim No Fit above. Simply note
we assigned $D_{p_{i,j},p_{i',j'}}$ to be $3K + 10$ when $(v_j, v_{j'}) \notin E_C$.

The previous two claims establish the fact that we must include $K$ distinct
vertices in $G$ which correspond to pairwise adjacent vertices in $C$. Hardness for
W[1] follows from the fact that our construction required only $K + 4$ partitions,
all edge weights are a function only of $K$ and the overall weight of the clique-tree
is also a function only of $K$.

Our second theorem shows that this problem is NP-complete even when the 
number of candidate homologous genes per genome is at most 3. 
Fig. 1. Construction for the 2, 3, 4-Exact-Core-Tree.



Theorem 7. 1-Exact-Core-Tree restricted to partitions of size 3 is NP- 
complete. 

Proof. Proof omitted due to space limitations, but it uses many of the same ideas
as the proof above. We reduce from 3SAT and map each clause to a partition,
with extra partitions $A$, $B$, $C$, $D$ as above.

Parameterizing on both the number of partitions k and the size of each 
partition m leads to a trivial FPT algorithm for 1, 2 -Core-Tree with a running 
time of 

If we consider the relaxation of Exact-Core-Tree to the optimization version
which asks for the core-set $P'$ which best fits the topology $T$, and modify
this optimization criterion so that it is always $> 0$, we can prove the following
non-approximability result via Theorem 7:

Theorem 8. The always positive, optimization version of Exact-Core-Tree
is not approximable within any function of $n$ in polynomial time, where $n$ is the
size of the graph $G$, unless P = NP.

Proof. Similar to Theorem 3. 

4 A Randomized Approximation Algorithm for the Core-Clique Problem

Following [GGR98], we will now give a randomized approximation algorithm for
the Core-Clique problem. The algorithm runs in linear time if each $P_i$ has size
bounded by a constant $m$, and polynomial time in the general case. Let
$\sigma(G, w)$ denote the maximum difference between the weights of two edges
adjacent to a vertex $v$, over all vertices $v$ of $G$ and its adjacent edges.

Theorem 9. For any $\epsilon, \delta \in (0, 1)$, there is a randomized algorithm for
the Core-Clique problem that, for a given instance $G, w$, with probability
$\ge 1 - \delta/2$ in polynomial time finds a solution of cost
$\le c^* + \epsilon\sigma(G, w)k^2$, where $c^*$ is the cost of the minimum cost
core-clique.
Consider a given Core-Clique instance $G, w$ and let $\sigma = \sigma(G, w)$. Let
$\epsilon$, our distance parameter, be such that $0 < \epsilon < 1$ and $\delta$, our
confidence parameter, be such that $0 < \delta < 1$. We use $[k]$ to denote the set
$\{1, 2, \ldots, k\}$.

Let $\ell = \lceil 8/\epsilon \rceil$ and $t = \Theta([\ldots] \log [\ldots])$. Consider a
partition of $[k]$ into $\ell$ sets $A_1, \ldots, A_\ell$ of approximately equal size.
Let $V_j = \bigcup_{i \in A_j} P_i$ and $W_j = V(G) \setminus V_j$.
For $U = U_1, \ldots, U_\ell$ where $U_j \subseteq [k] \setminus A_j$, let
$\mathcal{X}(U_j)$ be the family of all $X \subseteq W_j$ such that
$|X \cap P_i| = 1$ for all $i \in U_j$ and $X \cap P_i = \emptyset$ for
$i \notin U_j$, and let
$\mathcal{X}(U) = \{(X_1, \ldots, X_\ell) : X_j \in \mathcal{X}(U_j), j = 1 \ldots \ell\}$.

Algorithm Randomized A

1. Choose $U = U_1, \ldots, U_\ell$ where $U_j$ has size $t$ and is chosen
uniformly in $[k] \setminus A_j$.

2. For each $X \in \mathcal{X}(U)$

3. Let
$O^X = \{\arg\min_{v \in P_i} w(v, X_j) : 1 \le j \le \ell, i \in A_j\}$.

4. Output the core-clique $O^X$ which has minimum weight over all
$X \in \mathcal{X}(U)$.
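
The steps above can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: it assumes that $w(v, X_j)$ denotes the sum of the weights from $v$ to the vertices of $X_j$, it splits the partition indices round-robin into the blocks $A_j$, and it takes $\ell$ and $t$ as plain arguments instead of deriving them from $\epsilon$ and $\delta$. The name `randomized_a` and the `rng` parameter are introduced here.

```python
import random
from itertools import combinations, product


def randomized_a(partitions, weight, ell, t, rng=random):
    """Sketch of Algorithm Randomized A on 0-indexed partitions."""
    k = len(partitions)
    # Blocks A_1..A_ell of near-equal size (round-robin split of the indices).
    blocks = [list(range(j, k, ell)) for j in range(ell)]
    # Step 1: for each block A_j, sample t partition indices outside A_j.
    samples = []
    for block in blocks:
        outside = [i for i in range(k) if i not in block]
        samples.append(rng.sample(outside, min(t, len(outside))))
    # X(U_j): every way of picking one vertex from each sampled partition.
    choices = [list(product(*(partitions[i] for i in U))) for U in samples]
    best_core, best_weight = None, float("inf")
    # Step 2: enumerate every tuple X = (X_1, ..., X_ell).
    for X in product(*choices):
        core = [None] * k
        # Step 3: each partition in A_j picks its vertex closest to X_j.
        for block, X_j in zip(blocks, X):
            for i in block:
                core[i] = min(partitions[i],
                              key=lambda v: sum(weight(v, x) for x in X_j))
        total = sum(weight(u, v) for u, v in combinations(core, 2))
        # Step 4: keep the lightest core seen so far.
        if total < best_weight:
            best_core, best_weight = tuple(core), total
    return best_core, best_weight


# Hypothetical toy instance: vertex (i, x) sits in partition i at position x.
parts = [[(i, 0), (i, 5)] for i in range(4)]
dist = lambda u, v: abs(u[1] - v[1])
core, total = randomized_a(parts, dist, ell=2, t=1, rng=random.Random(0))
```

The enumeration visits at most $m^{t\ell}$ tuples, so the sketch is only practical when $t$ and $\ell$ are small.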

We will denote the minimum cost core-clique by O*. 

Lemma 2. With probability $1 - \delta/2$ over the choice of $U$ there is an
$X \in \mathcal{X}(U)$ such that $w(O^X) \le w(O^*) + \epsilon\sigma k^2/2$.

Proof. The proof is omitted due to space limitations, but very similar proofs can 
be found in [GGR98]. 

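Algorithm Randomized B

1. Choose $U = U_1, \ldots, U_\ell$ where $U_j$ has size $t$ and is chosen
uniformly in $[k] \setminus A_j$.

2. Uniformly choose a subset $C = \{c_1, \ldots, c_r\}$ of even size
$r = \Theta\big((t\ell \log m + \log(1/\delta))\,[\ldots]\big)$.

3. For each $X \in \mathcal{X}(U)$

4. For each $i \in C$, let
$v_i^X = \arg\min_{v \in P_i} w(v, X_j)$.

5. Output the tuple $X$ which minimizes
$\sum_{i=1}^{r/2} w(v^X_{c_{2i-1}}, v^X_{c_{2i}})$
over all $X \in \mathcal{X}(U)$.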
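The final version of our algorithm does the following. It computes a tuple
$X$ using Algorithm Randomized B and then outputs the core-clique
$O^X = \{\arg\min_{v \in P_i} w(v, X_j) : 1 \le j \le \ell, i \in A_j\}$. Since

$2 \sum_{i=1}^{r/2} w(v^X_{c_{2i-1}}, v^X_{c_{2i}}) / r$

has expected value $w(O^X)/k^2$, it follows that

$\Pr_C\left[\left|2 \sum_{i=1}^{r/2} w(v^X_{c_{2i-1}}, v^X_{c_{2i}})/r - w(O^X)/k^2\right| > \epsilon\sigma/4\right] \le [\ldots]$

Since $|\mathcal{X}(U)| \le [\ldots]$ it follows that

$\Pr_C\left[\forall X \in \mathcal{X}(U),\ \left|2 \sum_{i=1}^{r/2} w(v^X_{c_{2i-1}}, v^X_{c_{2i}})/r - w(O^X)/k^2\right| \le \epsilon\sigma/4\right] \ge 1 - \delta/2$.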

Discussion Our experimental results for a family of suspected Rubredoxin
proteins suggest that the Core-Clique optimization criterion does in fact allow
us to distinguish families of analogously functioning genes.

All of the algorithms mentioned in this paper have been implemented and 
tested. We note that our randomized approximation algorithm performs best 
when the input graph is quite large. We have also tried a number of greedy and 
randomized greedy heuristics for these problems and we have found that these 
simple heuristics tend to out-perform our randomized approximation algorithm 
in practice. There are a number of ways that ideas in the approximation algo- 
rithms can be used to derive more advanced heuristics (dominating the simpler 
ones) and possibly more practical algorithms with proven performance bounds. 
This is certainly a very challenging line of research that needs further consider- 
ation. 



References 

[ABFNPT96] Agarwala, R. et al. (1996) On the approximability of numerical
taxonomy. In: Proceedings of the Seventh Annual ACM-SIAM Symposium on Discrete
Algorithms, 365-372.

[AGMWL90] Altschul, S. F. et al. (1990) Basic local alignment search tool. J. Mol.
Biol., 215, 403-410.

[BA] Bairoch, A. and Apweiler, R. (1999) The SWISS-PROT protein sequence data
bank and its supplement TrEMBL in 1999. Nuc. Acids Res., 27, 49-54.

[BSFDHW00] van Beilen, J. B., Smits, T., Franchini, A., Disch, T., Hallett, M.,
Witholt, B. (2000) Two types of rubredoxins involved in alkane oxidation. To be
submitted to Gene. ETH Zurich.

[B92] Benner, S. A. (1992) Predicting de novo the folded structure of proteins. Current
Opinion in Structural Biology, 2:402-412.




476 Michael T. Hallett and Jens Lagergren 



[B99] Benner, S. A. (1998) Personal communication. 

[BDDEHY98] Bork, P. et al. (1998) Predicting function: from genes to genomes and
back. J. Mol. Biol. 283, 707-725.

[B71] Buneman, P. (1971) The recovery of trees from measures of dissimilarity.
In: Mathematics in the Archaeological and Historical Sciences, F. R. Hodson,
D. G. Kendall, P. Tautu, eds.: Edinburgh University Press, Edinburgh, 387-395.

[DF99] Downey, R. and Fellows, M. R. (1999) Parameterized Complexity. Springer
Verlag, New York.

[F88] Felsenstein, J. (1988) Phylogenies from molecular sequences: inference and
reliability. Annual Review of Genetics, 22, 521-565.

[GGR98] Goldreich et al. (1998) Property testing and its connection to learning and
approximation. J. of the ACM, 45:4, 653-750.

[GCMRM79] Goodman, M. et al. (1979) Fitting the Gene Lineage into its Species
Lineage: A parsimony strategy illustrated by cladograms constructed from globin
sequences, Syst. Zool., 28.

[GMS96] Guigo, R. et al. (1996) Reconstruction of Ancient Molecular Phylogeny.
Molec. Phylogenet. and Evol., 6(2), pp. 189-213, 1996.

[HL99a] Hallett, M. T. and Lagergren. J. (1999) Hunting for Functionally Analogous 
Genes: Cores of Partite Graphs (Full Paper). (1999) Tech. Report ETH Zurich, 
No. 327. 

[HL99b] Hallett, M. T. and Lagergren. J. (2000) New Algorithms for the Duplication- 
Loss Model. RECOMB ’00, Tokyo, Japan, p. 136-148. 

[H63] W. Hoeffding. (1963) Probability inequalities for sums of bounded random vari- 
ables. Amer. Statist. Assoc. J., 58, 13-30. 

[KTG98] Koonin, E. V. et al. (1998) Beyond complete genomes: from sequence to
structure and function. Curr Opin Struct Biol, 8(3), 355-63.

[MMS95] Mirkin, B. et al. (1995) A biologically consistent model for comparing
molecular phylogenies. Journal of computational biology, 2(4), 493-507.

[P98] Page, R. (1998) GeneTree: comparing gene and species phylogenies using
reconciled trees. Bioinformatics, 14(9), 819-820.

[PC97] Page, R. and Charleston, M. (1997) From Gene to organismal phylogeny:
reconciled trees and the gene tree/species tree problem. Molec. Phyl. and Evol. 7,
231-240.

[PL88] Pearson, W. R. and Lipman, D. J. (1988) Improved tools for biological sequence
comparison. Proc. Natl. Acad. Sci., 85:2444-2448.

[RS96] Rubinfeld, R. and Sudan, M. (1996) Robust characterization of polynomials
with applications to program testing. SIAM J. Comput. 25, 2, 252-271.

[SN87] Saitou, N. and Nei, M. (1987) The neighbour-joining method: a new method
for reconstructing phylogenetic trees. Mol. Biol. Evol., 4, pp. 406-425, 1987.

[SM98] Slonimski et al. (1998) The first law of genomics. Abstract “Microbial Genomes
II”, Hilton Head, January.

[S99] Stoesser, G. et al. (1999) The EMBL Nucleotide Sequence Database. Nuc. Acids
Res., 27(1), 18-24.

[TKL97] Tatusov, R. L. et al. (1997) A genomic perspective on protein families.
Science, 278(5338), 631-7.

[YEVB98] Yuan, Y. P. et al. (1998) Towards detection of orthologues in sequence
databases. Bioinformatics, 14(3), 285-289.




Keeping Track of the Latest Gossip in Shared 
Memory Systems 



Bharat Adsul, Aranyak Mehta, and Milind Sohoni 



Dept of Computer Science & Engineering, 

Indian Institute of Technology, Mumbai 400 076, India 
{abharat, aranyak, sohoni}@cse.iitb.ernet.in



Abstract. In this paper we present a solution to the ‘Latest Gossip 
Problem’ for a shared memory distributed system. The Latest Gossip 
Problem is essentially one of bounded timestamping in which processes 
must locally keep track of the ‘latest’ information, direct or indirect, 
about all other processes. A solution to the Latest Gossip Problem is 
fundamental to the understanding of information flow in a distributed 
computation, and has applications to problems such as global state de- 
tection and mutual exclusion. Our solution is along the lines of that for 
message passing systems in [6], and for synchronously communicating 
systems [8]. 

Our algorithm uses a modified version of the consume and update proto- 
cols of Dwork and Waarts [3], where these were introduced to construct 
a ‘Bounded Concurrent Timestamping System (BCTS)’. As applications 
of our Gossip Protocol, we also indicate another construction of a BCTS 
and a solution to the global state detection problem, which, we believe, 
are improvements over older solutions. 



1 Introduction 

Consider a distributed system of processes which communicate through protocols 
utilizing shared memory. The online latest gossip problem for this system is the 
following: 

Whenever a process q reads the information written by another process 
p, q should be able to decide which of p and q has more recent information,
direct or indirect, about r, for every other process r in the system. 

Once q makes this decision, it can systematically collate and update this infor- 
mation to maintain, on-line, its ‘latest gossip’ about every other process. The 
latest gossip provides crucial information to each process about the unfolding of 
the global computation. 

Note that there is a trivial solution to the latest gossip problem if we allow 
unbounded labels. For example, each process could label its writes from the set 
of integers, labeling its i-th write operation with the number i. This unique 
labeling of the writes also reflects their temporal order. Whence the process q, 
on reading p, may compare the r-label that p holds, with the r-label that q 
holds and update its latest information on r. Unfortunately, as time progresses,
the labels increase in size without bound. To eliminate this problem we need
an unambiguous labeling of the writes from a finite set of labels. This requires
a careful reuse of ‘old’ labels and a method of comparing ‘latest information’
using these labels. In our solution to the problem, with each write operation,
processes attach bounded ‘gossip’ information. The ‘gossip’ at each process
is a selection of a finite number of operations in its recent past arranged in a
suitable structure, and constitutes the bounded timestamp.
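
The trivial unbounded-label solution described above can be sketched as follows. This is an illustration only, not the bounded protocol developed in the paper; the class name `Process` and its attributes are assumptions introduced here, and reads and writes are treated as atomic steps.

```python
class Process:
    """A process that labels its i-th write with the integer i."""

    def __init__(self, name):
        self.name = name
        self.writes = 0
        # latest[r] is the largest write label of process r that this
        # process has heard about, directly or indirectly.
        self.latest = {}

    def write(self):
        self.writes += 1
        self.latest[self.name] = self.writes

    def read(self, other):
        # Compare the r-label held by `other` with our own r-label and
        # keep the larger one, for every process r we are not ourselves.
        for r, label in other.latest.items():
            if r != self.name and label > self.latest.get(r, 0):
                self.latest[r] = label


p, q, r = Process("p"), Process("q"), Process("r")
r.write()
p.read(r)      # p learns about r's first write directly
r.write()
q.read(r)      # q learns about r's second write directly
p.read(q)      # p learns about r's second write indirectly, via q
```

The integer labels make the comparison trivial, but they grow without bound, which is exactly the drawback the bounded scheme is designed to remove.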

Bounded timestamping protocols to solve the latest gossip problem have been 
exhibited for other distributed models. In [8] the authors introduce the latest 
gossip problem and present a solution in a distributed system where processes 
use handshake communication to synchronize and exchange information. This 
solution has also been extended to the message-passing model in which pro- 
cesses exchange information through messages sent from process to process [6]. 
However, there the underlying computation is restricted to that in which, at ev- 
ery instant, the number of ‘unacknowledged messages’ is uniformly bounded. As 
shown there, this requirement may be implemented by each process by waiting 
for acknowledgments to come in. It is argued in [7] that for applications which 
require robust solutions such as mutual exclusion, such wait is essential. 

With the same restriction on computations, the solution in [6] may be adapted 
to the shared memory model. However, in this model, the results of [7] do not 
hold: there are many situations in which computations with unbounded
unacknowledged writes are essential. We present here protocols for reading and
writing shared memory such that gossip information may be maintained in the
general case, without any restrictions on the underlying computation. In other 
words, the underlying computation may run ‘wait-free’, and maintain gossip at 
the same time. However, this is not without cost: Each request by the compu- 
tation for an operation on the shared memory unfolds into a sequence of the 
atomic read and write operations of the underlying system. Furthermore, a few 
additional synchronization variables are required to implement the label. All the 
same, the original computation proceeds wait-free and with identical semantics. 

The protocols by which a computation may read or write into shared mem- 
ory are called Read and Write respectively. These protocols are simple adap- 
tations of those presented by Dwork and Waarts [3] in their construction of a 
Bounded Concurrent Timestamping System (BCTS). In return, our solution in- 
dicates another construction of a BCTS in which we permit any number of hops 
(indirection) in the flow of information. The solution to the BCTS as presented 
in [3] also has the following drawback. Processes label their writes from a pool 
of available timestamps. Once this pool is exhausted, the process performs a 
‘garbage collection’ operation to replenish its pool of usable timestamps. Thus 
a process must stop work every once in a while and perform garbage collection. 
Our modification eliminates the need for garbage collection. Processes can keep 
track of usable timestamps ‘on-line’ during their normal operation in such a way 
that the pool of available timestamps is never exhausted. 

2 The Model

We begin by describing the distributed system as an abstract machine. Let
$\mathcal{P} = \{p_1, p_2, \ldots, p_N\}$ be a set of N processes which collaborate via shared memory.
The shared memory is segmented into N parts, with the part $M_p$ associated
with the process p. The ‘public’ memory $M_p$ may be read by any process, while
it may be written into only by process p. Besides this, each process also has
some private memory $L_p$ which is operable only by p. The memories $L_p$ and $M_p$
will be called p’s local memory, and may be further organized into a sequence of
variables, say, $M_p = \{p : X_1, \ldots, p : X_r\}$ and $L_p = \{p : Y_1, \ldots, p : Y_s\}$. We use the
typewriter font for public variables, and the italic font for private variables.

The processes are allowed to manipulate their public memory via two atomic 
operations - read and write. The read operation can be performed by a pro- 
cess p to read some public variable of another process q. The value gets written 
into some private variable of p. The write operation enables a process p to copy 
the contents of a local private variable to a local public variable. An operation 
of the machine is the occurrence of an operation in one of the processes. A com- 
putation is a (temporal) sequence of such operations, with obvious semantics. 
For any computation, there is the causality partial order on the operations in 
the computation. For operations e, f in a computation, we say that e < f if the
outcome of e may affect the outcome of f. The locality of each operation to just
one or two processes usually makes causality coarser than the total order. 

Each process p runs a program written in a suitable programming language. 
This program may use its local private memory Lp as it chooses. However, 
the local public memory $M_p$ may be accessed by two prescribed protocols: (i)
Write(p : X, p : Y), which acts to copy the contents of $p : Y \in L_p$ into $p : X \in M_p$,
and (ii) Read(q : X), which acts to copy the public variable q : X of another
process q into the local private memory of p. Any execution of the distributed
program must result in a computation. In particular, the above protocols must 
be ‘compiled’ into the elementary operations allowed by our machine. 

Our task is to supply protocols for Read and Write which will enable any 
computation arising from any program to correctly compute the latest infor- 
mation, online. The simplest possibility is to replace Read and Write by the 
atomic read and write operations of the machine. However, in this case, un- 
ambiguous timestamping from a bounded set becomes impossible, e.g., consider 
the situation in which a process p performs only Writes and no Reads. Since p 
will never know which Writes have been Read by other processes, it must label 
each successive Write with a distinct label, eventually exhausting its finite set 
of labels. This points to the need for some elaborate handshaking protocols for 
Read and Write. 

3 The Protocol 

We first present our adaptations of the update and consume protocols of [3]. 
Next, we describe the protocols Read and Write, which use the above consume
and update. Henceforth, for simplicity, we assume that the distributed program
requires only one public variable for each process, viz., p : X. 

3.1 Update and Consume Protocols 

Each process p maintains, for every other process q, the following additional
public variables: (i) two public bit-variables, $p : \mathtt{demand}_{pq}$ and $p : \mathtt{supply}_{pq}$, (ii)
two public variables $p : \mathtt{B1}_{pq}$ and $p : \mathtt{B2}_{pq}$, and (iii) a public variable $p : \mathtt{A}_{pq}$.
These variables are protocol variables used only in the protocol and may not be
used by the distributed program.

For a process p, the protocol update(p : X, p : Y) writes the value of a pri- 
vate variable p : Y in the public variable p : X. On the other hand, the protocol 
consume(q : X) reads q’s public variable q : X and copies it into p’s private
memory (into the private variable p : temp). Roughly speaking, the bit demand 
is used to indicate a desire of one process to read the public memory of another, 
and supply, its satisfaction. During an update, if p discovers unsatisfied demand, 
it proceeds to set aside a copy of its public variable. In the consume protocol, 
the process p first raises demand, and then proceeds to read q : X successively. 
If it notices a stable value, then the consume exits, declaring this stable value 
as the contents of q : X. If this fails, then p reads the value set aside by q, and 
declares this as the contents of q : X. The detailed protocols are described below.



consume(q : X)

  Perform handshake
   1. Read $q : \mathtt{supply}_{qp}$
   2. Write $p : \mathtt{demand}_{pq} := \neg\, q : \mathtt{supply}_{qp}$

  Remainder of the first Read-Write-Read (RWR)
   3. Read q : X
   4. Write $p : \mathtt{B1}_{pq} := q : X$
   5. Read q : X
   6. If (the label of) q : X is unchanged since Line 3 then $p : temp := p : \mathtt{B1}_{pq}$, else

  Remainder of the second Read-Write-Read (RWR)
   7. Write $p : \mathtt{B2}_{pq} := q : X$
   8. Read q : X
   9. If (the label of) q : X is unchanged since Line 5 then $p : temp := p : \mathtt{B2}_{pq}$, else

  Read Set-Aside
  10. Read $q : \mathtt{A}_{qp}$
  11. $p : temp := q : \mathtt{A}_{qp}$


update(p : X, p : Y)

   1. For each $q \neq p$, read $q : \mathtt{demand}_{qp}$
   2. For each $q \neq p$, if $q : \mathtt{demand}_{qp} \neq p : \mathtt{supply}_{pq}$ then $p : \mathtt{A}_{pq} := p : X$
   3. Atomically write p : X := p : Y and, for each q, $p : \mathtt{supply}_{pq} := q : \mathtt{demand}_{qp}$
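
The handshake in the two listings above can be exercised with a small single-threaded simulation. This is only a sketch under stated assumptions: the class `Public`, the helper `make_system`, and the `between` hook (used to interleave updates of q inside a consume by p) are hypothetical names introduced for illustration, and X is modelled as a (label, value) pair whose labels are assumed distinct across updates.

```python
from dataclasses import dataclass, field


@dataclass
class Public:
    """Public memory M_p of one process (hypothetical layout)."""
    X: tuple = (0, None)                        # (label, value)
    demand: dict = field(default_factory=dict)  # demand[q] plays the role of demand_pq
    supply: dict = field(default_factory=dict)  # supply[q] plays the role of supply_pq
    B1: dict = field(default_factory=dict)
    B2: dict = field(default_factory=dict)
    A: dict = field(default_factory=dict)       # A[q]: value set aside for q


def make_system(names):
    """Create the public memories of all processes with cleared protocol bits."""
    mem = {}
    for p in names:
        others = [q for q in names if q != p]
        mem[p] = Public(
            demand={q: False for q in others},
            supply={q: False for q in others},
            B1={q: None for q in others},
            B2={q: None for q in others},
            A={q: None for q in others},
        )
    return mem


def update(mem, p, new_x):
    """update(p : X, p : Y) with Y = new_x."""
    me = mem[p]
    seen = {q: mem[q].demand[p] for q in me.supply}  # line 1
    for q, d in seen.items():                        # line 2: unsatisfied demand
        if d != me.supply[q]:
            me.A[q] = me.X                           # set aside the current value
    me.X = new_x                                     # line 3, modelled as one step
    for q, d in seen.items():
        me.supply[q] = d


def consume(mem, p, q, between=lambda step: None):
    """consume(q : X) executed by p; returns the value placed in p : temp."""
    me, other = mem[p], mem[q]
    s = other.supply[p]                # line 1
    me.demand[q] = not s               # line 2
    between(2)
    first = other.X                    # line 3
    me.B1[q] = first                   # line 4
    between(4)
    second = other.X                   # line 5
    if second[0] == first[0]:          # line 6: stable value seen
        return me.B1[q]
    me.B2[q] = second                  # line 7
    between(7)
    third = other.X                    # line 8
    if third[0] == second[0]:          # line 9
        return me.B2[q]
    return other.A[p]                  # lines 10-11: read the set-aside value
```

When q stays quiet, the first RWR succeeds. If q updates after each write of p (at hook steps 4 and 7), both RWRs fail and p falls back on the value that q set aside on noticing p's demand.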



Consume and update operations performed by a process p are called p-consume 
and p-update respectively. The label in statement 6 (resp. 9) of the p-consume 
protocol above refers to labels read in statements 3 and 5 (resp. 5 and 8) and
which were attached to the quantity q : X by process q during its q-updates. This
labeling is the central construction of this paper. For the moment, we assume
that if any two of the q : X read in statements 3, 5 and 8 were written in distinct
q-updates then their labels would also be distinct. Next, note that a consume $\mathcal{C}$
may have had a ‘successful’ RWR on the information written by an update $\mathcal{U}$,
or it may have read the information set aside by a later $\mathcal{U}'$ but of an earlier $\mathcal{U}$.
In either case the consume $\mathcal{C}$ is said to have succeeded on the update $\mathcal{U}$.



3.2 The Read and Write Protocols 

The protocols for Read and Write appear below. Note that each Write con- 
tains exactly one update operation. Whence, we may define the label of a Write 
to be that of the update within. Also, note that the consume and the update 
operations are wait-free: processes can consume and update independently of the
state of other processes. Whence, the Read and Write are also wait-free.



Read(q : X)

   1. consume(q : X)


Write(p : X, p : Y)

   1. For each process $q \neq p$,
        Read $q : \mathtt{B1}_{qp}$ and $q : \mathtt{B2}_{qp}$
        consume(q : X)
   2. update(p : X, p : Y)



3.3 Causal Order 

Suppose, now, that each process executes a (distributed) program which, besides 
accessing its local private variables, manipulates its public memory by the Read 
and Write operations. Any finite (partial) computation of this program will
result in a sequence of Read/Write operations. These operations will henceforth
be referred to as program events, or simply events. The events at the process p will
be called p-events. 

For any two events e and e', we say that $e \sqsubset e'$ iff

— Either: e is a Write q-event and e' is a Read p-event and $\mathcal{C}$ succeeds on
$\mathcal{U}$, where $\mathcal{U}$ is the update in e and $\mathcal{C}$ is the consume operation in e'. In this
case e' is called an external successor of e. We also say that e' succeeds on e.

— Or: e is a Write q-event and e' is a Write p-event and $\mathcal{C}$ succeeds on $\mathcal{U}$,
where $\mathcal{U}$ is the update in e and $\mathcal{C}$ is the consume(q : X) operation in e'.
In this case also e' is called an external successor of e. We also say that e'
succeeds on e.

— Or: both e and e' are p-events, and e' is the very next p-event after e. In this
case e' is called an internal successor of e.

Define $e \sqsubseteq e'$ if $e = e'$ or $e \sqsubset e'$. Define $\sqsubseteq^*$ as the reflexive transitive closure of $\sqsubset$.
The partial order $\sqsubseteq^*$, henceforth referred to as causal order, records the information
we require about causality and independence between events. For any event e,
we denote by $e{\downarrow}$ the (partially ordered) set of events e' such that $e' \sqsubseteq^* e$. For a
process p and a p-event $e_p$, $e_p{\downarrow}$ is the ‘local view’ of p at $e_p$.

Note that the Read and Write calls can be ‘compiled’ into the elementary 
read and write operations using the protocols mentioned above. Hence any 
execution of the distributed program will result in a computation of our abstract 
machine. The causal order $\sqsubseteq^*$ is really a coarsening of the order < on the atomic
operations of the abstract machine. Indeed, the order $\sqsubseteq^*$ between program events
may be ‘observed’ as the relation < between specific critical atomic operations
encountered during the execution of those events. We state here two interesting 
properties satisfied by the causal order: 

Regularity If a p-event e succeeds on a q-event e' then there is no Write
q-event e'' after e' which finishes before the consume(q : X) in e starts.
In a sense this means that the Read/Write events do not read very old
information.

Monotonicity If two p-events e and e' succeed on two q-events f and f'
respectively, then if e' occurs after e then f' cannot occur before f.

These properties follow from the corresponding properties of the consume and 
update operations [3]. 

Let p, q be two processes and let e be a p-event. We define $\mathit{latest}_{p \to q}(e)$ as
the $\sqsubseteq^*$-maximum q-event f such that $f \sqsubseteq^* e$. This is the latest information that
p has about q after e. Note that $\mathit{latest}_{p \to q}(e)$ need not exist if there is no q-event
in $e{\downarrow}$. We set $\mathit{latest}_{p \to p}(e)$ to e itself. Observe that, for $q \neq p$, $\mathit{latest}_{p \to q}(e)$ is
always a Write q-event.
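
As an illustration of this definition, the following sketch computes the down-set of an event and the latest event of another process below it. It works on an explicit list of successor pairs, which a real process never has; the encoding of an event as a `(process, sequence_number)` tuple and the function names are assumptions made only for this example.

```python
from collections import defaultdict


def down_set(edges, e):
    """All events below e in the reflexive transitive closure of `edges`.

    `edges` is an iterable of (a, b) pairs meaning "b is an internal or
    external successor of a".
    """
    preds = defaultdict(set)
    for a, b in edges:
        preds[b].add(a)
    seen, stack = {e}, [e]
    while stack:
        cur = stack.pop()
        for a in preds[cur]:
            if a not in seen:
                seen.add(a)
                stack.append(a)
    return seen


def latest(edges, e, q):
    """Latest q-event below the event e, or None if no q-event lies below e.

    Events of one process are totally ordered by their sequence number, so
    the causal maximum among the q-events is the one with the largest number.
    """
    if e[0] == q:
        return e
    candidates = [f for f in down_set(edges, e) if f[0] == q]
    return max(candidates, key=lambda f: f[1], default=None)


# Example: r writes, q's Write consumes r, then p reads q.
edges = [
    (("r", 1), ("q", 1)),  # external successor
    (("q", 1), ("q", 2)),  # internal successor
    (("p", 1), ("p", 2)),  # internal successor
    (("q", 1), ("p", 2)),  # external successor
]
print(latest(edges, ("p", 2), "r"))  # p learns about r through q
```

In the example, p never reads r directly, yet its latest information about r is r's first event, relayed through q. This is the indirect flow of information that the gossip construction tracks.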

At any point during a computation, let $e_p$ be the last p-event. Similarly let $e_q$
be the last q-event. Let $e_1, e'_1$ denote $\mathit{latest}_{p \to r}(e_p)$ and $\mathit{latest}_{q \to r}(e_q)$ respectively.
Suppose that $e_1 \sqsubseteq^* e'_1$. Consider a path $e_1 \sqsubset e_2 \sqsubset \cdots \sqsubset e_k \sqsubset \cdots \sqsubset e_p$ from $e_1$ to $e_p$.
Let the next Write operation in the path after $e_1$ be $e_k$, an s-event. Then we
have the following lemma. Refer to [1] for the proofs.

Lemma 1 If ei is maximal (w.r.t. Q* ) in Cpl ne^J,, then et finishes after the 
consume(s : X) in starts. 

4 The Gossip Algorithm 

The Gossip Algorithm requires each process to maintain a pre-specified set of 
events in its recent past. For each process p, besides the events $\mathit{latest}_{p \to q}$, for
all q, this set contains some additional auxiliary ‘unacknowledged’ events. These
events and the $\sqsubseteq^*$-relationship between them constitute the primary graph. Let
us suppose that, at the end of a certain computation, the events within the
primary graphs of each process have been labeled distinctly. If the next event
is, say, an event in which p Reads q, then (i) p may correctly compare its latest
gossip with that of q, and if it is, say, a Write event of p, then (ii) provided
p can discover a suitable label, it may update its primary graph, maintaining 
the distinctness invariant. The hunt for a suitable label is aided by secondary 
information. 




Keeping Track of the Latest Gossip in Shared Memory Systems 483 



4.1 Primary Information 

Let e be a p-event. We denote by $\mathit{latest}(e) = \bigcup_{q \in \mathcal{P}} \mathit{latest}_{p \to q}(e)$ the latest
information p has about all processes at e.

To maintain and update the latest information of processes, we need to ex- 
pand the set of events that each process keeps track of. This expanded set is 
the primary information. The primary information of a process contains not 
only its latest information about every other process but also information about 
unacknowledged Writes. 

Unacknowledgment Recall that when a process q performs a consume(p : X)
operation $\mathcal{C}$, it writes twice a temporary value (with a label) in its public memory
(this is the ‘W’ of the RWRs of the consume). This information is written in
the public variables $\mathtt{B1}_{qp}$ and $\mathtt{B2}_{qp}$. These are the variables that p reads in its
Write events. Also recall that a process p may, during an update, set aside the
previous event for another process q in the variable $p : \mathtt{A}_{pq}$. Now given a fixed
run of Read and Write events we have, for every pair of processes p and q,
functions $\mathcal{F}_{p,q,B_1}$, $\mathcal{F}_{p,q,B_2}$ and $\mathcal{F}_{p,q,A}$ as follows:

— For i = 1, 2, $\mathcal{F}_{p,q,B_i}$ is a function from the set of all Write p-events to itself.
For any Write p-event e, $\mathcal{F}_{p,q,B_i}(e)$ is the Write p-event whose label was
written by q during some q-consume in its $q : \mathtt{Bi}_{qp}$ variable and read by p
during e from the $q : \mathtt{Bi}_{qp}$ variable of q.

— $\mathcal{F}_{p,q,A}$ is a function from the set of all Write p-events to itself. $\mathcal{F}_{p,q,A}(e)$ is
the event which was last set aside by p for q at or before e.

For a process p and for any Write p-event e, we define the set unacklist(e) as
the set of the following events:

— For each $q \neq p$, $\mathcal{F}_{p,q,B_1}(e)$, $\mathcal{F}_{p,q,B_2}(e)$ and $\mathcal{F}_{p,q,A}(e)$,

— The previous Write p-event, and

— The event e itself.

Note that the size of unacklist(e) is bounded above by 3N, where N is the
number of processes.

For a p-event e, we define $\mathit{unack}_r(e)$ to be the set $\mathit{unacklist}(\mathit{latest}_{p \to r}(e))$ for
$r \neq p$. Define $\mathit{unack}_p(e) = \mathit{unacklist}(e')$ where e' is the $\sqsubseteq^*$-maximum Write
p-event in $e{\downarrow}$, that is, the last Write p-event before e. We set
$\mathit{unack}(e) = \bigcup_{r \in \mathcal{P}} \mathit{unack}_r(e)$. Note that unack(e) is defined for Write as well as Read events.
The following lemma relates the causal order with unacklists.

Lemma 2 Let $e_1$ and $e_2$ be Write r-events and $e_s$ be a Write s-event for
some two processes r and s. Suppose $e_1 \sqsubseteq^* e_2$ and $e_1 \sqsubset e_s$. If it is not the case that
$e_s \sqsubseteq^* e_2$, then $e_1 \in \mathit{unacklist}(e_2)$.

The primary information of a process p after an event e consists of latest(e)
and unack(e). Events in the primary information are called primary events at e.




484 Bharat Adsul, Aranyak Mehta, and Milind Sohoni 



and the set is denoted by primary(e). Note that the size of primary(e) is bounded
above by $3N^2 + N$: at most $3N^2$ elements in unack(e) and N elements in latest(e).

To compare and update primary information, processes also need to remember
how their primary events are ordered by $\sqsubseteq^*$.

Primary graph The primary graph of a process p after an event e, denoted by
primary-graph(e), is the directed graph (V, E) where:

— V = the set of primary events at e.

— For $v_1, v_2 \in V$, let $e_1$ and $e_2$ be the corresponding primary events. Then,
$(v_1, v_2) \in E$ iff $e_1 \sqsubseteq^* e_2$.

The primary graph is the basic structure which is recursively maintained
by each process during a computation. We assume that, at any point during a
computation, the events in the primary graphs of the processes have received
distinct labels. In other words, if $e_p$ and $e_q$ are two $\sqsubseteq^*$-maximal events, and
$e, f \in \mathit{primary}(e_p) \cup \mathit{primary}(e_q)$, then $e \neq f$ iff $\mathit{label}(e) \neq \mathit{label}(f)$. Each process
p maintains this graph as one on labels and writes it into the public variable p : X
during every Write event, thus making it available to other processes.

4.2 Comparing Primary Information 

Let $e_q$ be a q-event and $e_p$ a Write p-event, such that $e_p \sqsubset e_q$, that is, $e_q$ is an
external successor of $e_p$. Let $e'_q$ be the q-event which immediately precedes $e_q$.
Let $I_p = e_p{\downarrow}$ and $I_q = e'_q{\downarrow}$. In general, before the occurrence of $e_q$, the processes
p and q will have incomparable information. The events known to both p and q
lie in $I_p \cap I_q$. Events lying ‘above’ $I_p \cap I_q$ are known to only one of the processes.

Let us assume that p has computed primary-graph($e_p$) at the end of $e_p$, and
that q has computed primary-graph($e'_q$) at the end of $e'_q$. When q successfully
reads primary-graph($e_p$), it will have to compare that latest information with
the latest information it itself knows and has kept in primary-graph($e'_q$).

Now if $e_p = \mathit{latest}_{q \to p}(e'_q)$, then there is nothing to do, as no new information
has reached q from p in $e_q$. So the interesting case is when $e_q$ succeeds on a newer
p-event. Our first observation is that if q knows both primary-graph($e_p$) and
primary-graph($e'_q$), it can ‘determine’ $I_p \cap I_q$, the events in $I_p$ which q already
knew, before $e_q$.

Lemma 3 For each maximal event e ($\neq e_p, e'_q$) in $I_p \cap I_q$, either
$e \in \mathit{latest}(e_p) \cap \mathit{latest}(e'_q)$ or $e \in \mathit{latest}(e_p) \cap \mathit{unack}(e'_q)$ or
$e \in \mathit{unack}(e_p) \cap \mathit{latest}(e'_q)$.

Thus, when q reads p’s primary graph, q can collect together in a set M all
the events that lie in $\mathit{latest}(e_p) \cap \mathit{latest}(e'_q)$, $\mathit{latest}(e_p) \cap \mathit{unack}(e'_q)$ and
$\mathit{unack}(e_p) \cap \mathit{latest}(e'_q)$. By the preceding lemma, the events in M subsume the maximal
events in the intersection $I_p \cap I_q$. (It is easy to see that those events in M which are
not actually maximal still lie within the intersection.)

Process q can use M to check whether a primary event
$e \in \mathit{primary}(e_p) \cup \mathit{primary}(e'_q)$ lies inside or outside the intersection: e lies
inside the intersection iff it lies below one of the elements in M. These comparisons
can be made using the edge information in the graphs primary-graph($e_p$) and
primary-graph($e'_q$).

Now, it is easy for q to compare the events in latest($e_p$) with those in latest($e'_q$)
to determine which of p and q has more recent information about every other
process r.

Lemma 4 Let $e = \mathit{latest}_{p \to r}(e_p)$ and $f = \mathit{latest}_{q \to r}(e'_q)$ such that $e \neq f$. Then,
$e \sqsubset^{+} f$ iff $f \in I_q - I_p$.

Once q has compared all events of the form $\mathit{latest}_{p \to r}(e_p)$ and $\mathit{latest}_{q \to r}(e'_q)$, it
can easily update its sets $\mathit{unack}_r(e'_q)$. The process which has better information
about r also has better information about r’s unacklist. In other words, q inherits
the set $\mathit{unack}_r(e_p)$ for every process r such that $\mathit{latest}_{p \to r}(e_p)$ is more recent than
$\mathit{latest}_{q \to r}(e'_q)$. On the other hand, if $\mathit{latest}_{p \to r}(e_p)$ is older than $\mathit{latest}_{q \to r}(e'_q)$,
then q ignores p’s set $\mathit{unack}_r(e_p)$ since it already has better information about
these events. Furthermore, q has the latest $\mathit{unack}_q(e_q)$ with it, and does not need
to update this set.

At this stage, q has updated its primary information and formed primary($e_q$)
using the information in primary-graph($e_p$) and primary-graph($e'_q$). We now need
to extend this set to the graph primary-graph($e_q$).

Let $f_1, f_2 \in \mathit{primary}(e_q)$. If both $f_1$ and $f_2$ came from primary($e_p$), then
we add an edge from $f_1$ to $f_2$ in primary-graph($e_q$) iff a corresponding edge
existed in primary-graph($e_p$). A symmetric situation applies if both $f_1$ and $f_2$
were contributed by primary($e'_q$). So, the only interesting case is when $f_1$ and
$f_2$ originally came from different processes. Without loss of generality, suppose
that $f_1$ came from primary($e_p$) and $f_2$ from primary($e'_q$). Then, from the method
which we used to compare events, we know that $f_1$ must have been in $I_p - I_q$
and $f_2$ must have been in $I_q - I_p$. So, it is clear that $f_1$ and $f_2$ are unordered
and there is therefore no edge between them in primary-graph($e_q$).

We now have the proposition: 

Proposition 5 Let $e_q$ be a q-event and $e_p$ a p-event, such that $e_q$ succeeds on
$e_p$ (that is, $e_q$ is an external successor of $e_p$). Let $e'_q$ be the q-event just preceding
$e_q$. Then, q can construct primary-graph($e_q$) from the graphs primary-graph($e_p$)
and primary-graph($e'_q$).

Notice that the procedure for updating primary graphs only checks the labels
of events which actually lie in the primary graphs. Call a q-event e ‘current’ if e
belongs to primary($e_p$) for the last ($\sqsubseteq^*$-maximum) event $e_p$ of some process p.
Recall that N is the number of processes in the system. We know that there are
at most $3N^2 + N$ distinct events in primary($e_p$) for process p. So, at any given
time, the number of events across the system which are current is bounded by
$N(3N^2 + N)$.
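
The counting behind these bounds can be spelled out as follows. This is a restatement of the figures given in the text; the breakdown $3(N-1) + 2$ for unacklist is read off from its definition (three events for each other process, plus the previous Write and the event itself).

```latex
\begin{align*}
|\mathit{unacklist}(e)| &\le 3(N-1) + 2 = 3N - 1 \le 3N, \\
|\mathit{unack}(e)|     &\le N \cdot 3N = 3N^2, \\
|\mathit{latest}(e)|    &\le N, \\
|\mathit{primary}(e)|   &\le 3N^2 + N, \\
\#\{\text{current events}\} &\le N\,(3N^2 + N) = 3N^3 + N^2 .
\end{align*}
```

For example, with $N = 4$ processes a primary graph holds at most $3 \cdot 16 + 4 = 52$ events, and at most $4 \cdot 52 = 208$ events are current across the system.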

Each event begins by being current. Meanwhile, as the computation pro- 
gresses, this event may get added to the primary information of other processes. 
However, it gradually recedes into the past, until it drops out of the primary 
information of all processes. At this time, the label assigned to this event can be
reused, since the old event with the same label can never become current again. 

A process can keep track of which of its events in the system are current by 
keeping track of one additional level of events, called secondary information. 

4.3 Secondary Information 

Consider a p-event e for some process p. The secondary information of p at e is
the collection of sets primary(f) for each event f in primary(e). This collection
of sets is denoted secondary(e).

The following lemma says that the only p-events which can be current in the
system are those which occur in p’s secondary information.

Lemma 6 Let, at any time, $e_p$ be the last p-event, and $e_q$ the last q-event, for
some processes p and q. Let $I_p = e_p{\downarrow}$ and $I_q = e_q{\downarrow}$. If e is a p-event which
belongs to primary($e_q$), then $e \in \mathit{secondary}(e_p)$.

We will use the preceding result in the following form.

Corollary 7 Let e be a p-event such that $e \notin \mathit{secondary}(e_p)$. Then e does not
belong to primary($e_q$) for the last event $e_q$ of any $q \in \mathcal{P}$.

As long as all processes which refer to the same label in their primary in- 
formation are actually pointing to the same event, reusing labels should cause 
no confusion. Therefore, if p knows that no p-event labeled $\ell$ is currently part
of the primary information of any process in the system, it can safely use $\ell$ to
timestamp the next Write which it performs. 

Secondary information can be updated in a straightforward manner when 
we update primary information — if q inherits an event e from p’s primary 
information, it also inherits the secondary information primary {e) associated 
with e. Notice that it suffices to maintain secondary information as an indexed 
set — we do not need to maintain secondary graphs in a manner similar to 
primary graphs. Note that the number of events in the secondary information is 
less than lOA^^. 

4.4 Labeling from a Bounded Set 

In this subsection we describe precisely how bounded timestamping is performed 
using the results of this section. For each process p let £p be a finite set of 
labels such that \Cp\> Process p uses the set Lp to label its Writes. It 

maintains two primary graphs, the public primary graph in its public memory 
Mp and the private primary graph in its private memory Lp. It also maintains a 
public secondary information in Mp and a private secondary information in Lp. 
The public primary graph and the public secondary information of process p are 
available to other processes through p : X. 

At a Read p-event, only the private primary graph and the private secondary 
information are updated. While at a Write p-event, both the primary graphs 
and both the secondary informations are updated. We describe next what steps 
p has to take at every Read and Write. 

When p performs a Read: Process p will read q's public primary graph and
public secondary information and compare it with its own private primary graph
and private secondary information.

— Extract the label $\ell$ of the $\sqsubseteq^*$-maximum q-event in q’s public primary graph.

— If a new event has been read then update the private primary graph and
private secondary information as described earlier in this section.

When p performs a Write: Let e denote this new Write event and e' the
previous Write p-event. Recall that a Write consists of N − 1 consumes (for
each $q \neq p$, consume(q : X)) and an update (update(p : X, p : Y)).

— On the consume(q : X): The steps are the same as when it performs a Read.

— On the update(p : X, p : Y): Process p does the following:

• Choose a label $\ell$ for e from $\mathcal{L}_p$ which does not appear in the private
secondary information.

• Replace e' by e in the latest component of the private primary graph.

• Replace unacklist(e') in the private primary graph by unacklist(e),
consisting of the events $\mathtt{B1}_{*p}$, $\mathtt{B2}_{*p}$ and $\mathtt{A}_{p*}$ read during the present Write
event e, as well as e and e'. The ordering among these is available from
the discarded unacklist.

• Update the private secondary information in accordance with the change
in the private primary graph.

• Copy the new private primary graph and private secondary information into
the public primary graph and public secondary information respectively.
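
The label-selection step of the update can be sketched as below. The representation of the private secondary information as a set of `(process, label)` pairs and the function name `choose_label` are assumptions made for illustration; the paper only requires that the chosen label does not occur in the secondary information.

```python
def choose_label(label_pool, secondary_info, owner):
    """Pick a label for the owner's next Write.

    label_pool     -- iterable of candidate labels (the finite label set)
    secondary_info -- set of (process, label) pairs currently recorded in
                      the owner's private secondary information
    owner          -- name of the writing process
    """
    in_use = {label for (proc, label) in secondary_info if proc == owner}
    for label in label_pool:
        if label not in in_use:
            return label
    # Cannot happen if the pool is larger than the secondary information.
    raise RuntimeError("label pool too small for the secondary information")


secondary = {("p", 0), ("p", 1), ("q", 2)}
print(choose_label(range(5), secondary, "p"))
```

Labels used by other processes do not block a choice, since each process labels only its own Writes. A label becomes free again as soon as the event carrying it drops out of the secondary information, so no separate garbage-collection phase is needed.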

Putting together all the results we have proved so far, we can state the following 
theorem. 

Theorem 8 The algorithm described above solves the latest gossip problem in a 
shared memory system for computations consisting of Reads and Writes, with
only a bounded amount of additional information being attached to each Write. 

5 Discussion 

In this paper, we have presented a solution to the latest gossip problem in a 
shared memory system. The gossip construction is extremely powerful and imme- 
diately leads to the effective construction of the latest operator, denoted Latest, 
with the expected semantics. This operator may be used to define and maintain 
auxiliary variables such as $p : \mathit{Latest}_{p \to q}(q : X)$ which is a local private variable
of p, but which tracks the latest contents of q : X. The latest operator may
even be composed and (causally) compared, e.g., the program at process p may
check whether $p : \mathit{Latest}_{p \to q}(\mathit{Latest}_{q \to p}(p : \mathtt{flag}))$ refers to the current contents
of p : flag. Such variables (and causality comparisons) should prove useful in
writing parallel programs which meet desired behavioural specifications. Refer
to [5] for an algorithm for mutual exclusion which uses such auxiliary variables. As
opposed to this, e.g., in [4], for the same mutual exclusion problem, even with
a BCTS, the bakery algorithm requires an intricate manipulation of another
set of variables p : choosing, one for each process p, which is not part of the 
behavioural specification of the mutual exclusion problem. 

The gossip construction immediately leads to a BCTS: The scan operation of 
a BCTS translates to a sequence of N − 1 Reads, one for every other process. The
label operation translates to a Write. The output of the scan operation is the
sequence $\{\mathit{Latest}_{p \to q} \mid q \in \mathcal{P}\}$ ordered by $\sqsubseteq^*$. The solution of the global state
detection problem as posed in [2] is even simpler: the primary information at each 
process p provides a global state. Furthermore, global states are always current 
and may be maintained online without requiring additional communications. 

The gossip problem was originally motivated by problems of logical specifi- 
cation and verification of distributed systems. The solution for synchronization 
systems, as in [8], was crucially used in [9] for the effective construction of a 
trace-based extension of linear temporal logic to reason about synchronization
protocols. We believe that the shared-memory solution presented here, besides 
being useful in protocol synthesis, will also be useful in developing an automata- 
theoretic framework for protocol verification and logics. 

Abstract. Vector and matrix clocks are extensively used in asynchronous
distributed systems. This paper asks, “how does the clock abstraction
generalize?” and casts the problem in terms of concurrent knowledge.
To this end, the paper motivates and proposes logical clocks of
arbitrary dimensions. It then identifies and explores the conceptual link
between such clocks and knowledge. It establishes the necessary and
sufficient conditions on the size and dimension of clocks required to
declare k-level concurrent knowledge about the most recent global facts
for which this is possible without using control messages. It then gives
algorithms to compute the latest global fact about which a specified level
of knowledge is attainable in a given state, and to compute the earliest
state in which a specified level of knowledge about a given global fact is
attainable.



1 Introduction 

1.1 Motivation 

A large number of application areas in asynchronous distributed message-passing 
systems use vector clocks and matrix clocks. Some example areas that use vec- 
tor clocks [6,14] are checkpointing, garbage collection, causal memory, maintain- 
ing consistency of replicated files, taking efficient snapshots of a system, global 
time approximation, termination detection, bounded multiwriter construction of 
shared variables, mutual exclusion, debugging, and defining concurrency mea- 
sures. Some example areas that use matrix clocks are designing fault-tolerant 
protocols and distributed database protocols [9,20], including protocols to dis- 
card obsolete information in distributed databases [18], and protocols to solve 
the replicated log and replicated dictionary problems [20] . 

Vector clocks can be thought of as imparting knowledge to a process: when 
V[i] = x at process h, process h knows that process i has executed at least x
events. Matrix clocks give one more level of knowledge: when M[i,j] = x at process
h, process h knows that process i knows that process j has executed at least
x events. Vector and matrix clocks are convenient as they are updated without
sending additional messages; knowledge is imparted via the inhibition-free ambient
message passing that (i) eliminates control messages by using piggybacking,
and (ii) diffuses knowledge using only computation messages, whenever sent.
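
To make the two levels of knowledge concrete, here is a minimal matrix clock sketch using the usual textbook update rules, which this text does not spell out. The class name `MatrixClock` and its methods are assumptions for illustration. Row `me` of the matrix is the process's own vector clock, and the whole matrix is piggybacked on each computation message.

```python
class MatrixClock:
    """Matrix clock of process `me` in a system of n processes."""

    def __init__(self, me, n):
        self.me = me
        self.M = [[0] * n for _ in range(n)]

    def tick(self):
        """Record one local event."""
        self.M[self.me][self.me] += 1

    def send(self):
        """Count the send event and return a copy of the matrix to piggyback."""
        self.tick()
        return [row[:] for row in self.M]

    def receive(self, sender, W):
        """Merge the matrix W piggybacked on a message from `sender`."""
        n = len(self.M)
        # Whatever the sender knew about others' knowledge, we now know too.
        for a in range(n):
            for b in range(n):
                self.M[a][b] = max(self.M[a][b], W[a][b])
        # Our own row absorbs the sender's own row (plain vector clock merge).
        for k in range(n):
            self.M[self.me][k] = max(self.M[self.me][k], W[sender][k])
        self.tick()

    def vector(self):
        """The embedded vector clock V of this process."""
        return self.M[self.me]


c0, c1, c2 = MatrixClock(0, 3), MatrixClock(1, 3), MatrixClock(2, 3)
c1.receive(0, c0.send())   # process 0 sends to process 1
c2.receive(1, c1.send())   # process 1 sends to process 2
print(c2.vector())         # what process 2 knows directly
print(c2.M[1][0])          # what process 2 knows that process 1 knows about 0
```

After the two messages, process 2 knows that process 1 knows that process 0 has executed at least one event, although no control message was exchanged.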

This paper asks the question: “how does this clock abstraction generalize?”
The problem is cast in terms of concurrent knowledge (“everybody knows on 
consistent cuts”), which is a form of knowledge appropriate for (time-free) asyn- 
chronous distributed systems [16] — all the applications mentioned above im- 
plicitly use concurrent knowledge that is not common knowledge in their clock 
algorithms, although this has never been formally studied as such. 

1.2 Background 

A distributed system can be modeled by a network (N, L), where N is the set of
processes that communicate by message passing over L, the set of logical links.
We assume an asynchronous distributed (message passing) system, i.e., there
is no global clock or shared memory, relative process speeds are independent,
and message delivery times are finite but unbounded [2,16]. Common knowledge,
which has been proposed as a definition of agreement in distributed systems, is
defined as follows [8]. A process i that knows a fact $\phi$ is said to have knowledge
$K_i(\phi)$, and if “every process in the system knows $\phi$”, then the system exhibits
knowledge $E^1(\phi) = \bigwedge_{i \in N} K_i(\phi)$. A knowledge level of $E^2(\phi)$ indicates that every
process knows $E^1(\phi)$, i.e., $E^2(\phi) = E(E^1(\phi))$. Inductively, a hierarchy of levels
of knowledge $E^j(\phi)$ ($j > 0$) gets defined, where $E^{j+1}(\phi) \Longrightarrow E^j(\phi)$. Common
knowledge of $\phi$, denoted as $C(\phi)$, is defined as the knowledge X which is the
greatest fixed point of $E(\phi \wedge X)$ and is equivalent to $\bigwedge_{j \in Z^*} E^j(\phi)$, where $Z^*$ is
used to denote the set of whole numbers. Common knowledge requires simultaneous
action for its achievement and is therefore unattainable in asynchronous
distributed systems [5,7,8,17].

Panangaden and Taylor proposed concurrent common knowledge (CCK),
which is required to be attained simultaneously in logical time based on causality
[13], and is attainable in asynchronous distributed systems [16]. Specifically, CCK
can be attained at a “consistent cut” or possible global state [1] in the system
execution. To define concurrent common knowledge, [16] first defines $P_i(\phi)$ to
represent the statement “there is some consistent global state of the current
execution that includes i’s local state, in which $\phi$ is true.”
$E^C(\phi) = \bigwedge_{i \in N} K_i P_i(\phi)$ and is attainable by the processes at a
consistent global state. Likewise, higher levels of knowledge $(E^C)^k(\phi)$, for
$k > 1$, are attainable by the processes at a consistent global state. Concurrent
common knowledge of $\phi$, denoted by $C^C(\phi)$, is defined as the knowledge $X$
which is the greatest fixed point of $E^C(\phi \wedge X)$ and is equivalent to
$\bigwedge_{j \in Z^*} (E^C)^j(\phi)$. This form of knowledge underlies many
existing protocols involving processes reaching agreement about some property
of a consistent global state, defined using logical time and causality.

$C^C(\phi) \Longrightarrow (E^C)^j(\phi)$ ($j \in Z^*$). Several applications (see
Section 1.1) need only lower levels of such knowledge. Vector clocks [6,14]
provide specific knowledge of a global fact/state $\phi$, equivalent to concurrent
knowledge $(E^C)^0(\phi)$, in the application domain. However, vector clocks are not
sufficient for other applications, for which it is necessary to use matrix
clocks. Matrix clocks provide concurrent knowledge $(E^C)^1(\phi)$ about facts
$\phi$ in the application domain. Thus, although levels of concurrent knowledge
(besides CCK) have not received formal attention apart from [16], they are
implicitly used in a wide range of applications. Hence, studying levels of
concurrent knowledge is important.

Two important and desirable characteristics of the clock protocols used to
achieve $(E^C)^0(\phi)$ and $(E^C)^1(\phi)$ are that they do not use any control
messages and they diffuse knowledge on a continual basis, using piggybacked
timestamps
on the application messages as and when they are sent (see Section 1.1). These 
clock protocols are not full-information protocols [5], yet for each of the above 
applications, they suffice to provide the required degree of concurrent knowledge 
because the clocks are defined so as to capture the property of interest. 

All other known protocols to attain concurrent common knowledge and levels
of concurrent knowledge are variants of the global state recording algorithm
[1,4,16]. Such global state protocols require (i) $O(\min(k \cdot |N|, |L|))$
messages to attain $(E^C)^k(\phi)$ and $O(|L|)$ messages to attain $C^C(\phi)$, and
(ii) $O(d)$ communication time steps, where d is the network diameter. In the
above, the |L| factor in the
message complexity can be reduced to |N| if inhibitory protocols are used, at the
cost of inhibitory time delays [4]. All protocols based on global state recording 
require control messages for a one-time knowledge attainment of each fact, and 
may additionally use freezing/inhibition. Hence, they are not considered further. 

In Theorem 1, we show that for the class of facts we consider, which
includes the applications listed in Section 1.1, $(E^C)^k(\phi)$ is equivalent to
$E^k(\phi)$. So we also refer to $(E^C)^k(\phi)$ as just $E^k(\phi)$.

1.3 Objectives 

This paper examines the feasibility of and mechanisms for achieving levels of
knowledge $E^k(\phi)$ ($k > 0, 1$) using ambient message passing, i.e.,

1. No control messages can be used. Control information may be piggybacked 
on computation messages. Also, no freezing/inhibition is allowed. 

2. The latest knowledge about the (past) computation should be diffused as 
much as possible, using only the computation messages, whenever sent. 

As justified in Section 1.2, we focus only on clock-based protocols. We now for- 
malize the objectives. The full-information protocol (FIP) which attains common 
knowledge (in a synchronous system) has been defined such that “at each step 
(after each local event), a process broadcasts (via control messages) to other pro- 
cesses its local state (which captures everything it knows)” [5]. The FIP is very 
expensive, and does not meet our criteria. We now define a “no control messages” 
protocol, the full-information piggybacking protocol (FIPP), to be one in which 
on each computation message of the application, the local state information is 
piggybacked by the sender. This protocol meets the criteria but is expensive 
in terms of the information piggybacked. We define the k-bounded-information
piggybacking protocol to overcome this drawback of the FIPP.

Definition 1. The k-bounded-information piggybacking protocol (KIPP) is
such that on each computation message of the application, k-bounded state
information^1 is piggybacked by the sender j, where k-bounded information is
information of the form $K_j(K_{i_1}(K_{i_2} \ldots (K_{i_k}(\phi)) \ldots))$, where
$i_1, i_2, \ldots, i_k \in N$, for any fact $\phi$ on the system state.

^1 Presumably about some property of interest.

Facts about the property of interest, which are a function of the system state, 
are represented by the timestamp of that system state in the applications of 
Section 1.1. Similarly, 0- and 1-bounded information about these facts are also 
represented as timestamps in these applications. Therefore, we will assume that 
appropriate timestamps can represent facts relevant to the application, and k- 
bounded information about them. The type of facts considered in Section 1.1 
and which we will consider satisfy monotonicity. Informally, $\phi$ is a monotonic
fact in a run if $\phi$ holds in some global state, and for every later global state,
some $\psi$ holds and $\phi \sqsubseteq \psi$ in that later state (see Definition 5).
Monotonic facts are also stable. The paper answers the following questions.

Problem 1. In a system using the KIPP protocol, what are the necessary and
sufficient conditions on the timestamp information required to achieve and
declare $E^k(\phi)$, where $\phi$ is the greatest possible monotonic fact (most
recent possible system state) about which $E^k(\phi)$ is possible to be declared
in the current state?

Problem 2. For any global monotonic fact $\phi$ on the system state, what is the
earliest global state in which $E^k(\phi)$ is attained using the KIPP protocol?

Problem 3. Given a timestamp of a system state, what is the maximum possible
monotonic fact $\phi$ (most recent possible system state) about which $E^k(\phi)$ can
be declared in the given state in a system using the KIPP protocol?

Section 2 describes the system model and existing clock systems. Section 3
defines monotonic facts. Section 4 proposes $\alpha$-dimensional clocks. Section 5
analyzes the levels of knowledge that can be inferred using $\alpha$-dimensional
clocks and answers the above problems. Section 6 concludes. See [12] for the
full paper.

2 Preliminaries 

2.1 System Model 

We assume an asynchronous distributed system (see the first paragraph of
Section 1.2). The notion of the local state of a process is primitive. An event
e at process i is denoted $e_i$. An event causes a local state transition. The
local history of process i, denoted $h_i$, is a possibly infinite sequence of
alternating local states
(beginning with a distinguished initial state) and events [16]. It is equivalently 
described by the initial state and the sequence of local events. 

Formally, an asynchronous distributed system consists of (i) a network (N, L),
(ii) a set $H_i$ of possible local histories for each process i, (iii) a set A of
asynchronous runs or executions, or computations, each of which is a vector of
local histories, one per process, and (iv) a set of messages sent in any possible
asynchronous run. The system follows the KIPP protocol (Definition 1).

A given run of a distributed system has a poset event structure model as in
[13]. Let $(H, \prec)$ represent the set of events $H$ in a system run that are
related by the causality relation $\prec$, an irreflexive partial order [13]. H is
partitioned into local executions, one per process. Each local execution defines
the local history. We assume the initial state of each process is common
knowledge.

A global state (or cut) of run a is an n-vector of prefixes of local histories of a,
one prefix per process. It can be viewed equivalently as the union of the events
in prefixes of the local histories of a, one prefix per process. A consistent global
state (consistent cut) is a global state such that if the receipt of a message is
recorded, then the sending of that message is also recorded [1]. It can be viewed
equivalently as a downward-closed subset of H. Let $H^{\perp}$ denote the empty cut.

For a given run, the set of all cuts, Cuts, forms a lattice ordered by “$\subseteq$”
(subset); the set of downward-closed cuts is its sublattice [14]. The sequence
of states in actual time is a chain in this sublattice. The sublattice is not
visible to any process, but gives the possible consistent cuts which could have
occurred and are “valid” views of the run. Our results implicitly deal with such
a run.

We define F(Cut) to be the set consisting of the latest event at each process
in cut Cut. F(Cut) denotes the “front” of cut Cut.

Definition 2. $F(Cut) =_{def} \{e_i \in Cut \mid \forall e'_i \in Cut,\ e'_i \preceq e_i\}$

Given a cut Cut, its projection $F_i(Cut)$ is the element of F(Cut) at process i.

Define $\downarrow e$ as $\downarrow e =_{def} \{e' \mid e' \preceq e\}$. The cut $\downarrow e$ has a
unique maximal event e and is downward-closed in $(H, \prec)$. As the set of all
downward-closed cuts forms a lattice, $\bigcap_{x \in X} \downarrow x$ and
$\bigcup_{x \in X} \downarrow x$, denoted as $\cap\Downarrow X$ and $\cup\Downarrow X$, resp., are
downward-closed cuts for any set of events X. These cuts are used to prove
Theorems 6 and 7. Based on the definition of $\downarrow e$, we can assert as follows.

Proposition 1. $e \in \cap\Downarrow X \iff \forall x \in X,\ e \preceq x$

$\cap\Downarrow X$, the largest set of events that causally precede every $x \in X$,
represents the largest execution prefix with the following property: any fact in
this execution prefix can be known in the local state of each process after event
$x \in X$ [11].

A Kripkean interpretation of knowledge modality requires the identification 
of an appropriate set of possible worlds - in the system model, the possible 
worlds are the (consistent) cuts of the set of possible asynchronous runs [7,8]. 
(a, c) denotes cut c in run a. Standard definitions for the modal operators $K_i$
and $P_i$, and for various forms of knowledge are used. The formal semantics are
given by the satisfaction relation $\models$ and are the same as in [16].
Proposition 1 can now be expressed in this logic. Assuming that adequate
knowledge about local histories is propagated, for any cut X,
$(a, X) \models E(\cap\Downarrow F(X))$, i.e., all the processes know
$\cap\Downarrow F(X)$ after execution of X.

2.2 Logical Clocks 

Logical clocks track causality which determines the extent of the past computa- 
tion that could possibly be known at any state/event. A clock is a function that 
maps cuts in a run to elements in the time domain T. Thus, $Clk : Cuts \mapsto T$.
Clocks provide a quantitative identifier for cuts. For any run, the timestamp of
a cut (which is the union of a prefix of the local history of each process), is
defined using timestamps of cuts of the form $\downarrow e$. When we say that an event e
is assigned a clock value/timestamp, more formally we mean that the cut
$\downarrow e$ is assigned that clock value/timestamp. Also, a subscripted timestamp
$T_i$ denotes a timestamp of an event at process i, and |N| is also denoted as n.

Scalar clocks [13], vector clocks [6,14], and matrix clocks [9,18,20] are the only 
clocks proposed in the literature. A canonical clock updates the local component 
of the clock by one at each local event. Henceforth, we assume canonical clocks. 
A canonical vector clock assigns timestamps to an event as follows. 

Definition 3. $T(e) =_{def} (\forall i \in N)\ T(e)[i] = |\{e_i \mid e_i \preceq e\}|$, i.e.,
T(e)[i] is the number of events on process i that causally precede or equal e.

For any run, vector clocks of size n track the progress at each process (and are
needed to capture concurrency; see discussion on dimension of $(H, \prec)$ [3]). For
cut Cut, we define its timestamp T(Cut) such that its ith component is the ith
component of the timestamp of event $F_i(Cut)$ [11].

Definition 4. $T(Cut) =_{def} (\forall i \in N)\ T(Cut)[i] = T(F_i(Cut))[i]$

The vector timestamp of a cut identifies the number of events at each process 
in the cut. For any run, there is an isomorphism between Cuts and T^, the set 
of canonical vector timestamps such that {T G T^) T[i] < \hi\ in that run. 

Proposition 2. For a run (F[,~<), (Cuts,c) is isomorphic to (T^,<). 

Lemma 1. The timestamp of cut $\bigcap_{x \in X} \downarrow x$, denoted
$T(\cap\Downarrow X)$, is expressed as a function of the timestamps of the members of X
as follows. $(\forall i \in N)\ T(\cap\Downarrow X)[i] = \min_{x \in X}(T(x)[i])$.

In Lemma 1, X can be an arbitrary set of events, also termed a nonatomic 
event [10,11]. Lemma 1 will be shown to have a counterpart Lemma 3 that is 
based on higher dimensional clocks, and which is used in the proof of Theorem 7. 
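
Lemma 1 amounts to a component-wise minimum over vector timestamps. The following is a minimal sketch under that reading; the function name `intersection_timestamp` and the list-of-lists representation are illustrative assumptions, not notation from the paper.

```python
def intersection_timestamp(timestamps):
    """Vector timestamp of the largest prefix that causally precedes every event in X.

    `timestamps` holds one vector timestamp (a sequence of ints, one entry per
    process) for each event x in X. Following Lemma 1, component i of the
    result is the minimum of component i over all members of X.
    """
    return [min(column) for column in zip(*timestamps)]


# Two events with vector timestamps (3, 1, 2) and (2, 4, 2) on a 3-process system.
common_prefix = intersection_timestamp([[3, 1, 2], [2, 4, 2]])
```

Here `common_prefix` evaluates to `[2, 1, 2]`: two events of process 0, one of process 1, and two of process 2 are in the causal past of both events.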

For any run $(H, \prec)$, observe from Definition 2 that there is a bijection from
the set containing each cut Cut to the set containing each front of a cut F{Cut). 
So, the timestamp of F{Cut) is defined to be the timestamp of Cut. 

3 Monotonic Facts 

We now define monotonic facts. Such facts capture the relevant properties of
the applications in Section 1.1, and it is this class of facts that we consider.
Examples of such facts are “computation has progressed at least up to global state
state_vector”, and “all logs up to global state state_vector can be discarded”. As
in the applications in Section 1.1, we assume facts of interest are related by a
semantic inclusion relation “$\sqsubseteq$” (if $\phi \sqsubseteq \psi$, then $\psi$
semantically includes $\phi$).

Definition 5. For a given run a, any fact $\phi$ is monotonic iff for every cut c at
which $(a, c) \models \phi$, and for every cut c' such that $c \subseteq c'$, there exists
some fact $\psi$ such that $(a, c') \models \psi$ and $(a, c') \models (\phi \sqsubseteq \psi)$.

Monotonic facts are also stable; however, not all stable facts are monotonic. 

Lemma 2. For a monotonic fact $\phi$, the following are all stable facts: $\phi$,
$K_i(\phi)$, $K_i P_i(\phi)$, $E(\phi)$, and $E^C(\phi)$.

Let $\psi$ be any of $\phi$, $K_i(\phi)$, $K_i P_i(\phi)$, $E(\phi)$, and $E^C(\phi)$, where
$\phi$ is a monotonic fact. When process m receives a message with $\psi$
piggybacked on it from process j at event $e_m^x$ resulting in local state
$s_m^x$, we have $s_m^x \models K_m P_m K_j(\psi)$. Using Lemma 2, we can show that
“the $P_m$ operator can be safely removed”, and hence
$s_m^x \models K_m K_j(\psi)$ (and also $s_m^x \models K_m(\psi)$). Similarly,
$\bigwedge_{i \in N} K_i P_i(\psi)$ is equivalent to $\bigwedge_{i \in N} K_i(\psi)$.
Developing this idea further leads to Theorem 1 that allows us to
replace concurrent knowledge with the equivalent normal knowledge.

Theorem 1. In a system following the KIPP protocol, the greatest possible
monotonic fact $\phi$ about which $(E^C)^k(\phi)$ is possible to be declared in a given
state is the greatest possible monotonic fact $\phi'$ about which $E^k(\phi')$ is
possible to be declared in the given state.

As in the applications of Sect. 1.1, we assume that for any run, the set of
monotonic facts ordered by $\sqsubseteq$ is a lattice, there is a semantically
greatest fact in each state, and that there is an (iso/homo)morphism from
$(Cuts, \subseteq)$ to $(\mathcal{M}, \sqsubseteq)$, where $\mathcal{M}$ is the set that contains
the greatest monotonic fact (of interest) at each cut in $(Cuts, \subseteq)$.
Combining this (iso/homo)morphism with Prop. 2 (and restricting to consistent
states for semantic integrity) leads to Prop. 3.

Proposition 3. The (semantically greatest) monotonic fact of interest in a 
global state, whose truth value is a function of that global state, will be uniquely 
identified by the timestamp of that global state. 



4 Clocks of Arbitrary Dimensions 

Definition 6. An $\alpha$-dimensional clock $Clk^{\alpha}$ defines the mapping
$Clk^{\alpha} : Cuts \mapsto (Z^*)^{n^{\alpha}}$ (i.e., $Clk^{\alpha}$ is an $\alpha$-dimensional
array of integers, where each dimension is of size n), satisfying the following
properties.

SP1. The local clock component at process j, $Clk_j^{\alpha}[j, j, \ldots, j]$, is common
knowledge in the initial system state, i.e.,
$(a, H^{\perp}) \models C(Clk_j^{\alpha}[j, j, \ldots, j])$.

SP2. The local clock component at process j, $Clk_j^{\alpha}[j, j, \ldots, j]$, must be
incremented by a natural number when a computation event occurs at j.

SP3. Any element $Clk^{\alpha}(e_j)[i_1, i_2, \ldots, i_{\alpha}]$ is the maximum scalar clock
value $\phi_{i_\alpha} = Clk_{i_\alpha}^{\alpha}[i_\alpha, i_\alpha, \ldots, i_\alpha]$ at $i_\alpha$ such that
$K_j(K_{i_1}(K_{i_2}(K_{i_3}(\ldots(\phi_{i_\alpha})\ldots))))$.
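R0. (Initial state:) $Clk_i^{\alpha}$ = $\alpha$-dimensional 0-vector

R1. (Internal event:) Before process i executes the event,
    $Clk_i^{\alpha}[i, i, \ldots, i] = Clk_i^{\alpha}[i, i, \ldots, i] + d$ ($d > 0$)

R2. (Send event:) Before process i executes the event,
    $Clk_i^{\alpha}[i, i, \ldots, i] = Clk_i^{\alpha}[i, i, \ldots, i] + d$ ($d > 0$).
    Send message timestamped with $Clk_i^{\alpha}$.

R3. (Receive event:) When process j receives a message with timestamp $T^{\alpha}$ from
process i,

    1. for $\beta = 1$ to $\alpha - 1$ do
         $\forall q_1 \in N \setminus \{j\}, \forall q_2, q_3, \ldots, q_{\beta} \in N$,
         $Clk_j^{\alpha}[\underbrace{j, \ldots, j}_{\alpha - \beta \text{ times}}, \underbrace{q_1, q_2, \ldots, q_{\beta}}_{\beta \text{ entries}}] = \max(Clk_j^{\alpha}[\underbrace{j, \ldots, j}_{\alpha - \beta \text{ times}}, \underbrace{q_1, q_2, \ldots, q_{\beta}}_{\beta \text{ entries}}],\ T^{\alpha}[\underbrace{i, \ldots, i}_{\alpha - \beta \text{ times}}, \underbrace{q_1, q_2, \ldots, q_{\beta}}_{\beta \text{ entries}}])$

    2. $\forall q_1 \in N \setminus \{j\}, \forall q_2, \ldots, q_{\alpha} \in N$,
         $Clk_j^{\alpha}[q_1, q_2, \ldots, q_{\alpha}] = \max(Clk_j^{\alpha}[q_1, q_2, \ldots, q_{\alpha}],\ T^{\alpha}[q_1, q_2, \ldots, q_{\alpha}])$

    3. $Clk_j^{\alpha}[j, j, \ldots, j] = Clk_j^{\alpha}[j, j, \ldots, j] + d$ ($d > 0$)

    4. Deliver the message.

Fig. 1. Protocol to operate $\alpha$-dimensional clocks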
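
The rules of Fig. 1 can be sketched in Python as follows. This is an illustrative reading, not the author's implementation: the class name `DimClock`, the dictionary keyed by index tuples, and the choice d = 1 (canonical clocks) are assumptions made for the example. Rule R3.1 is read as copying from the sender's own sub-array (leading indices equal to the sender i) into the receiver's own sub-array (leading indices equal to the receiver j), which reduces to the usual matrix clock rule when the dimension is 2.

```python
from itertools import product


class DimClock:
    """Sketch of an alpha-dimensional canonical clock for process `pid` of n processes."""

    def __init__(self, pid, n, alpha):
        self.pid, self.n, self.alpha = pid, n, alpha
        # R0: every entry of the alpha-dimensional array starts at 0.
        self.clk = {idx: 0 for idx in product(range(n), repeat=alpha)}
        self.local = (pid,) * alpha  # index of the local component [j, j, ..., j]

    def internal(self):
        # R1: advance the local component (d = 1 for canonical clocks).
        self.clk[self.local] += 1

    def send(self):
        # R2: advance the local component, then piggyback a copy of the clock.
        self.clk[self.local] += 1
        return dict(self.clk)

    def receive(self, sender, ts):
        j, n, a = self.pid, self.n, self.alpha
        # R3.1: merge the sender's own beta-dimensional views into our own views.
        for beta in range(1, a):
            for q in product(range(n), repeat=beta):
                if q[0] == j:
                    continue
                mine = (j,) * (a - beta) + q
                theirs = (sender,) * (a - beta) + q
                self.clk[mine] = max(self.clk[mine], ts[theirs])
        # R3.2: entry-wise maximum for every entry whose first index is not j.
        for q in product(range(n), repeat=a):
            if q[0] != j:
                self.clk[q] = max(self.clk[q], ts[q])
        # R3.3: advance the local component; R3.4 (delivery) is left to the caller.
        self.clk[self.local] += 1


# Two processes with 2-dimensional (matrix) clocks exchange one message each way.
p0, p1 = DimClock(0, 2, 2), DimClock(1, 2, 2)
p1.receive(0, p0.send())
p0.receive(1, p1.send())
```

After this exchange `p0.clk[(1, 0)]` is 1: process 0 knows that process 1 knows that process 0 has executed at least one event. With `alpha = 1` the same code behaves as an ordinary vector clock.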



With canonical clocks, d = 1 and $Clk^{\alpha}(e)[i_1, i_2, \ldots, i_{\alpha}] =
|F_{i_\alpha}(\downarrow \ldots F_{i_3}(\downarrow F_{i_2}(\downarrow F_{i_1}(\downarrow e))) \ldots)|$. The value
of $Clk^{\alpha}$ assigned as a timestamp is denoted $T^{\alpha}$. $T^{\alpha}[i]$, also
represented as $T^{\alpha}[i, \cdot]$, is a timestamp of dimension $(\alpha - 1)$ and is
derived from $T^{\alpha}$ by instantiating the first dimension variable $i_1$ by i.
$T^{\alpha}(e_p)[i, \cdot]$ is the $(\alpha - 1)$-dimensional timestamp of the most recent
event at process i, as known to process p after event $e_p$. Moreover, this most
recent event at process i has a scalar timestamp $T^{\alpha}(e_p)[i, i, \ldots, i]$. In
terms of knowledge, $T^{\alpha}(e_p^x)[i, \cdot]$ represents the knowledge
$s_p^x \models K_p(K_i K_{i_2} K_{i_3} \ldots K_{i_\alpha}(\phi))$, where only $i_1$ is
instantiated by i in $K_p(K_{i_1} K_{i_2} K_{i_3} \ldots K_{i_\alpha}(\phi))$, for all
$i_1, i_2, \ldots, i_\alpha \in N$. Analogously, $T^{\alpha}[a, b, \ldots, f, \cdot]$, with
$\beta$ entries ($\beta > 0$) instantiated, is a timestamp of dimension
$(\alpha - \beta)$. $T^{\alpha}(e_p^x)[a, b, \ldots, f, \cdot]$ represents the knowledge
$s_p^x \models K_p(K_a K_b \ldots K_f K_{i_{\beta+1}} \ldots K_{i_\alpha}(\phi))$, where the first
$\beta$ dimension variables $i_1, \ldots, i_\beta$ are instantiated in
$K_p(K_{i_1} K_{i_2} K_{i_3} \ldots K_{i_\alpha}(\phi))$, for all
$i_1, i_2, \ldots, i_\alpha \in N$. When p = (a = b = ... = f),
$T^{\alpha}(e_p^x)[a, b, \ldots, f, \cdot]$ is effectively an $(\alpha - \beta)$-dimensional
timestamp of $e_p^x$.

Theorem 2. The protocol in Fig. 1 implements the a-dimensional clock speci- 
fication of Definition 6. 

The protocol in Fig. 1 has a space and time complexity of $\Theta(n^{\alpha})$. Rules
(R3.1 and R3.2) can be simplified using simple observations, as shown in [12]. The
size of each clock of dimension $\alpha$ is $n^{\alpha}$ integers. This clock/timestamp
size may be reduced by using information such as the message pattern, logical
network topology, and the partial order using analysis such as in [15,19,20], or
by using approximations to the true clock, using schemes such as in [9,20].
The $\alpha$-dimensional timestamp of a cut is defined using the
$(\alpha - 1)$-dimensional timestamp of the latest event at each process in that cut.

Definition 7. $T^{\alpha}(Cut) =_{def} (\forall i \in N)\ T^{\alpha}(Cut)[i, \cdot] = T^{\alpha}(F_i(Cut))[i, \cdot]$

Lemma 3. The timestamp of cut $\bigcap_{x \in X} \downarrow x$, denoted
$T^{\alpha}(\cap\Downarrow X)$, is expressed as a function of the timestamps of the members
of X as follows. $(\forall i \in N)\ T^{\alpha}(\cap\Downarrow X)[i, \cdot]$ is the
$(\alpha - 1)$-dimensional timestamp $T^{\alpha}(x')[i, \cdot]$, where $x' \in X$ is such that
$T^{\alpha}(x')[i, i, \ldots, i] = \min_{x \in X}(T^{\alpha}(x)[i, i, \ldots, i])$.

Lemma 3 gives a way to implement the test for Proposition 1. It will be used
in Theorem 7 to identify the maximum computation prefix $\phi$ (cut) about which
knowledge has been attained at a given cut Cut.

Recall that by Proposition 3, the problem of identifying the minimum possible
computation prefix (cut) c such that $(a, c) \models E^k(\phi)$ for a given $\phi$
(Problem 2) is equivalent to the problem of identifying the minimum possible
computation prefix c such that $(a, c) \models E^k(Cut)$, where Cut is the cut in
which $\phi$ is true. Likewise, the problem of identifying the maximum possible fact
$\phi$ such that $(a, c) \models E^k(\phi)$ at a given cut c (Problems 1 and 3), is
equivalent to the problem of identifying the maximum possible computation prefix
Cut such that $(a, c) \models E^k(Cut)$. We now give the main results linking clocks
and knowledge.

5 Attaining Knowledge Using Clocks 

At process i, k-bounded knowledge (of global facts about a property of interest)
is of the form $K_i(K_{i_1}(K_{i_2} \ldots K_{i_k}(\phi)))$. The number of unique
permutations of the $K_{i_j}$ operators that represent k-bounded knowledge is
computed as follows. $i_1 \neq i$, and $\forall j \in [2, k]$, $i_j \neq i_{j-1}$. Thus,
$\forall j \in [1, k]$, $i_j$ can take one of n - 1 values, giving $(n - 1)^k$
permutations; each permutation denotes a global fact about which k-bounded
knowledge exists at process i. Each global fact is represented by a cut and
requires a vector (n integers). Thus, the space for k-bounded knowledge at i is
$n \cdot (n - 1)^k$ integers. The space requirement for all levels of knowledge
up to k at process i is $n \cdot \frac{(n-1)^{k+1} - 1}{n - 2}$ integers.

Lemma 4. Representation of k-bounded knowledge (of global facts about a
property of interest) needs $n \cdot \frac{(n-1)^{k+1} - 1}{n - 2}$ integers.
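
The counting argument above can be checked numerically. The sketch below assumes the closed form is the geometric sum of $n \cdot (n-1)^j$ over levels j = 0..k (valid for n > 2); the function names are illustrative only.

```python
def k_bounded_entries(n, k):
    """Integers needed at one process for exactly level-k knowledge: n * (n-1)^k."""
    return n * (n - 1) ** k


def total_entries(n, k):
    """Integers needed for all levels 0..k, by direct summation."""
    return sum(k_bounded_entries(n, j) for j in range(k + 1))


def total_entries_closed_form(n, k):
    """Same quantity via the geometric-sum closed form; requires n > 2."""
    return n * ((n - 1) ** (k + 1) - 1) // (n - 2)


# For n = 4 processes and k = 2: 4 * (1 + 3 + 9) integers.
needed = total_entries(4, 2)
```

For n = 4 and k = 2 both forms give 52 integers, which fits inside a 3-dimensional clock of $4^3 = 64$ integers but not inside a 2-dimensional clock of 16.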

From the inequality <> n • ~ ^ , we now get Theorem 3. 

Theorem 3. k-bounded knowledge (of global facts about a property of interest)
cannot be represented by a k-dimensional clock system, but can be represented
by a (k + 1)-dimensional clock system.

By definition, $E^k(\phi) = \bigwedge_{i \in N} K_i P_i(E^{k-1}(\phi))$, i.e., each process
knows $E^{k-1}(\phi)$ along some (consistent) global state. To identify the bounds
on space complexity to determine $E^k(\phi)$ knowledge for the latest possible
$\phi$ in a system
(1) Problem Inputs:
(1a)   array of int $T^1$;  // vector timestamp of earliest state in which $\phi$ is true
(1b)   int k;             // level of knowledge $E^k(\phi)$ to be attained
(2) Problem Output:
(2a)   array of int $TS^{k+1}$ = ComputeState($T^1$, k).
(2b)   // vector timestamp of earliest state in which $E^k(\phi)$ is attained
(3) function ComputeState(array of int $T^1$; int k) returns $TS^{k+1}$
(4)   for lvl = 1 to k + 1 do
(5)     $\forall p \in N$ do
(6)       identify earliest event $e_p \mid T^1(e_p) \geq TS^{lvl-1}$;
(7)       $T'[p] = T^1(e_p)[p]$;
(8)     $\forall p \in N$ do
(9)       $TS^{lvl}[p] = \max(T^1(e_1)[p], T^1(e_2)[p], \ldots, T^1(e_n)[p])$;
(10)    // $(a, TS^{lvl}) \models E^{lvl-1}(T^1) \wedge \nexists\, TS' \mid (TS' < TS^{lvl} \wedge (a, TS') \models E^{lvl-1}(T^1))$
(11)  return($TS^{k+1}$).

Fig. 2. Given $\phi$, protocol to compute earliest system state in which $E^k(\phi)$ is
achievable



following the KIPP protocol (Problem 1), it can be shown that
$\forall i_1, i_2, \ldots, i_k$, $K_i(K_{i_1} \ldots K_{i_k}(\psi_{i_1, i_2, \ldots, i_k}))$ must be
available at each process i, where $\psi_{i_1, i_2, \ldots, i_k}$ is the max. execution
prefix about which the corresponding knowledge is available, i.e., “i knows $i_1$
knows $i_2$ knows ... $i_k$ knows $\psi_{i_1, i_2, \ldots, i_k}$”. The max. execution prefix
$\phi$ about which $E^k(\phi)$ is attained is given by
$\bigcap_{i_1, i_2, \ldots, i_k} \psi_{i_1, i_2, \ldots, i_k}$.

Theorem 4. In a system following the KIPP protocol, k-bounded knowledge at
each process is required to attain and declare $E^k(\phi)$, where $\phi$ is the maximum
possible monotonic fact (most recent possible system state) about which
$E^k(\phi)$ is possible to be declared in the current state.

Theorem 5 (= Thms. 3 + 4) and Theorem 6 answer Problems 1 and 2, resp.

Theorem 5. In a system following the KIPP protocol, a (k+1)-dim clock system
is sufficient but a k-dim clock system is not sufficient to attain and declare
$E^k(\phi)$, where $\phi$ is the maximum possible monotonic fact (most recent possible
system state) about which $E^k(\phi)$ is possible to be declared in the current state.

Theorem 6. Given a global monotonic fact $\phi$, the earliest global state in which
$E^k(\phi)$ is attained in a system following the KIPP protocol is given by the
protocol in Fig. 2.

Given the earliest cut where $\phi$ becomes true, specified by $T^1$ (which by
Proposition 3 captures fact $\phi$), Fig. 2 gives a protocol to determine the earliest
global state at which $E^k(\phi)$ can be attained. The protocol is iterative. Function
ComputeState uses two inputs: (i) $T^1$, the vector timestamp of the earliest
state in which $\phi$ holds, and (ii) k, the level of knowledge $E^k(\phi)$ to be attained.
The output is $TS^{k+1}$, the vector timestamp of the earliest state in which
assertion $E^k(\phi)$ can be made. The protocol is proved correct by showing that the
invariant in line (10) holds after each iteration. Note that in each iteration,
$T'$ (line (7)) identifies a global state that may not be consistent; hence a
consistent global state $TS^{lvl}$ (line (9)) is computed.

Complexity: Time complexity is O(# send and receive events in $(H, \prec)$ after
the cut at which $\phi$ is defined). Space complexity is that of a vector clock
system, and also requires each process to store a trace of the timestamps of its
send and receive events beyond the cut at which $\phi$ is defined.
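
The iteration of Fig. 2 can be sketched as follows. This is a hedged reading rather than the paper's code: the name `compute_state`, the `traces` argument (one ordered list of vector timestamps per process), the starting value equal to the input timestamp, and the `None` result when the recorded trace is too short are all assumptions made for the example.

```python
def compute_state(traces, t1, k):
    """Earliest cut (as a vector timestamp) at which level-k knowledge of t1 holds.

    traces[p] is the ordered list of vector timestamps of process p's events;
    t1 is the vector timestamp of the earliest state in which the fact holds.
    Returns None if the recorded traces end before the level is attained.
    """
    ts = list(t1)  # assumed initial value of the iteration
    for _ in range(k + 1):
        chosen = []
        for trace in traces:
            # Earliest event at this process whose timestamp dominates ts.
            hit = next((t for t in trace
                        if all(a >= b for a, b in zip(t, ts))), None)
            if hit is None:
                return None
            chosen.append(hit)
        # Component-wise maximum gives a consistent cut containing every chosen event.
        ts = [max(column) for column in zip(*chosen)]
    return ts


# Process 0: internal event, send, receive. Process 1: receive, send.
traces = [[(1, 0), (2, 0), (3, 2)],
          [(2, 1), (2, 2)]]
level0 = compute_state(traces, (1, 0), 0)
level1 = compute_state(traces, (1, 0), 1)
```

In this run `level0` is `[2, 1]` and `level1` is `[3, 2]`; asking for one more level returns `None` because process 1 has no recorded event that dominates `(3, 2)`.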

To answer Problem 3, “Given a timestamp $Ob\_T^{\beta}$ of a state, what is the
maximum $\phi$ such that $(a, Ob\_T^{\beta}) \models E^k(\phi)$?”, we can apply the function
min to the 1-dimensional timestamps of size n in the given $Ob\_T^{\beta}$. This
requires $n \cdot n^{\beta - 1}$ comparisons. Theorem 7 gives a solution of
$O(\beta \cdot (n^2 + n))$ time complexity.

Theorem 7. Given a timestamp $Ob\_T^{\beta}$, the maximum possible monotonic fact
$\phi$ (most recent possible system state) about which $E^k(\phi)$, where
$k \leq \beta - 1$, can be declared at the given state in a system following the KIPP
protocol is given by the protocol in Fig. 3.

The proof is by construction. Fig. 3 gives a protocol to derive the max.
computation prefix $\phi$ about which the processes have knowledge $E^k(\phi)$, given
the timestamp $Ob\_T^{\beta}$, where $\beta > k$. Compute-Phi has inputs (i)
$T^{\alpha}$, the (variable dim.) timestamp of the maximum cut about which
$E^{atn}$ knowledge is attained in $Ob\_T^{\beta}$, (ii) m, the level of knowledge that
is yet to be attained, and (iii) atn, the level of knowledge already attained. The
output is the timestamp $\phi$ of the max. cut about which $E^k$ knowledge is
attained in the given state $Ob\_T^{\beta}$.

Compute-Phi is invoked as Compute-Phi($Ob\_T^{\beta}$, k, 0) and is tail-recursive.
$T^{\alpha}$ is progressively decreased at each recursion level to add another level of
knowledge to what is known of $T^{\alpha}$ at cut $Ob\_T^{\beta}$. So at each additional
recursion level, $T^{\alpha}$ therein converges towards $\phi$. Each recursion level
behaves as follows.

- Given $T^{\alpha}(Cut)$, the loop in lines (5)-(6) computes the
  $(\alpha - 1)$-dimensional timestamp of $F_p(Cut)$, which is the latest event of the
  cut Cut at process p ($p \in N$). $T^{\alpha - 1}(F_p(Cut))$ is simply
  $T^{\alpha}[p, \cdot]$.

- Let X denote the events F(Cut) identified in line (6). The loop in lines
  (7)-(9) applies Lemma 3 to X to compute $T^{\alpha - 1}(\cap\Downarrow X)$. By doing so,
  it identifies the timestamps $T^{\alpha - 2}(F_p(\cap\Downarrow X))$ for each process p.
  Then $T^{\alpha - 1}(\cap\Downarrow X)$ is simply the aggregation of the n timestamps
  $T^{\alpha - 2}(F_p(\cap\Downarrow X))$, as shown in line (10). By Proposition 1,
  $T^{\alpha - 1}(\cap\Downarrow X)$ is the timestamp of the maximal prefix about which all
  the processes have knowledge at X = F(Cut) and this can be asserted only at or
  after F(Cut). Thus, $E(T^{\alpha - 1})$ holds in the state with timestamp
  $T^{\alpha}$ in this recursion level and we assert the invariant on line (11).

- The above steps also add a level of knowledge to that at the given initial state
  $Ob\_T^{\beta}$; we assert this in the invariant on line (13). If this is the
  desired level of knowledge, then we have the terminating case for the recursion
  and the value of $T^{\alpha - 1}$ is returned (lines (14)-(16)), else Compute-Phi is
  recursively




500 Ajay D. Kshemkalyani 



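(1) Problem Inputs:
(1a)   $\beta$-dim. array of int $Ob\_T^{\beta}$;  // timestamp of observation state
(1b)   int k, where $\beta > k \geq 1$;          // level of knowledge to be attained
(2) Problem Output:
(2a)   $(\beta - k)$ dim. array of int $\phi$ = Compute-Phi($Ob\_T^{\beta}$, k, 0).
(2b)   // timestamp of maximum possible state such that $(a, Ob\_T^{\beta}) \models E^k(\phi)$

(3) function Compute-Phi(var. dim. array of int $T^{\alpha}$; int m, atn) returns $\phi$
(4a)  // $T^{\alpha}$ is timestamp of the max. possible state such that $(a, Ob\_T^{\beta}) \models E^{atn}(T^{\alpha})$
(4b)  // m is the level of knowledge yet to be attained
(4c)  // atn is the level of knowledge already attained, atn = k - m.
(5)   $\forall p \in N$ do
(6)     $T_p^{\alpha - 1} = T^{\alpha}[p, \cdot]$;
(7)   $\forall p \in N$ do
(8)     let r be such that $T_r^{\alpha - 1}[p, p, \ldots, p] = \min_{q \in N}(T_q^{\alpha - 1}[p, p, \ldots, p])$;
(9)     $T_p^{\alpha - 2} = T_r^{\alpha - 1}[p, \cdot]$;
(10)  $T^{\alpha - 1}$ = aggregation of the n timestamps $T_p^{\alpha - 2}$ ($p \in N$);
(11)  // $(a, T^{\alpha}) \models E^1(T^{\alpha - 1}) \wedge \nexists\, T'^{\alpha - 1} \mid (T'^{\alpha - 1} > T^{\alpha - 1} \wedge (a, T^{\alpha}) \models E^1(T'^{\alpha - 1}))$
(12)  atn = atn + 1; m = m - 1;
(13)  // $(a, Ob\_T^{\beta}) \models E^{atn}(T^{\alpha - 1}) \wedge \nexists\, T'^{\alpha - 1} \mid (T'^{\alpha - 1} > T^{\alpha - 1} \wedge (a, Ob\_T^{\beta}) \models E^{atn}(T'^{\alpha - 1}))$
(14)  if m = 0 then
(15)    $\phi = T^{\alpha - 1}$;
(16)    return($\phi$);
(17)  else
(18)    $\phi$ = Compute-Phi($T^{\alpha - 1}$, m, atn);
(19)    // $(a, T^{\alpha - 1}) \models E^m(\phi) \wedge \nexists\, \phi' \mid (\phi' > \phi \wedge (a, T^{\alpha - 1}) \models E^m(\phi'))$
(20)    // $(a, T^{\alpha}) \models E^{1+m}(\phi) \wedge \nexists\, \phi' \mid (\phi' > \phi \wedge (a, T^{\alpha}) \models E^{1+m}(\phi'))$
(21)    // $(a, Ob\_T^{\beta}) \models E^{atn+m}(\phi) \wedge \nexists\, \phi' \mid (\phi' > \phi \wedge (a, Ob\_T^{\beta}) \models E^{atn+m}(\phi'))$
(22)    return($\phi$).

Fig. 3. Protocol to compute latest $\phi$ for which $E^k(\phi)$ holds in a state with
timestamp $Ob\_T^{\beta}$, where $\beta > k$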



invoked to determine the greatest $\phi$ that is known at $T^{\alpha - 1}$ for the
remaining m levels of knowledge to be attained (lines (17)-(18)).

The invariants on lines (11, 13, 19, 20, 21) are seen to hold. Hence, $\phi$ is the
max prefix such that $(a, Ob\_T^{\beta}) \models E^k(\phi)$, derived from the recursive use
of Lemma 3.
Complexity: The time complexity is $O(k \cdot (n^2 + n))$. The space complexity is
that of $\beta$-dimensional clocks, which is $O(n^{\beta})$ integers and meets the tight
bound established by Theorem 5. The time complexity is less than the space
complexity because information is selectively accessed dynamically.
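
The recursion of Fig. 3 can be sketched iteratively in Python. This is an illustrative reading under stated assumptions: the timestamp is a nested list with one nesting level per dimension, the names `diagonal` and `compute_phi` are hypothetical, and the caller is assumed to respect the precondition that the dimension exceeds k.

```python
def diagonal(ts, p):
    """Scalar ts[p][p]...[p], following index p through every remaining dimension."""
    while isinstance(ts, list):
        ts = ts[p]
    return ts


def compute_phi(ob_t, k):
    """Apply k passes of the Lemma 3 selection to a nested-list timestamp.

    Each pass lowers the dimension by one: for every process p it picks the
    process r whose view of p has the smallest scalar component, and keeps
    r's sub-timestamp for p. Assumes the dimension of ob_t is greater than k.
    """
    ts, n = ob_t, len(ob_t)
    for _ in range(k):
        reduced = []
        for p in range(n):
            r = min(range(n), key=lambda q: diagonal(ts[q], p))
            reduced.append(ts[r][p])
        ts = reduced
    return ts


# 2-dimensional timestamp of a cut: row q is process q's vector at the cut.
phi = compute_phi([[3, 1], [2, 2]], 1)
```

For the 2-dimensional input above the single pass reduces to a column-wise minimum, so `phi` is `[2, 1]`, the familiar matrix clock test used for discarding obsolete information.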

Necessary and sufficient conditions required to declare E^{(j)) using the KIPP: 
Lemma 4 and Theorem 4 together give the conditions on the exact size of clocks, 
whereas Theorem 5 gave the conditions on the dimension of clocks. 
6 Concluding Remarks

So far, concurrent knowledge has been studied much less than normal knowledge 
although asynchronous systems are much more prevalent than synchronous ones. 
This paper made significant contributions to the theory of concurrent common 
knowledge and proposed logical clock systems of arbitrary dimensions. Specif- 
ically, it made the following contributions, (i) It motivated and proposed log- 
ical clocks of arbitrary dimensions, and also formalized the KIPP protocol for 
knowledge transfer used by such clock systems, (ii) It showed that there exists a 
tight relation between the dimension of logical clocks and the level of concurrent 
knowledge attainable, and established some complexity bounds. Here it iden- 
tified and explored an important conceptual link, (iii) It proposed algorithms 
to compute the latest global fact about which a specified level of knowledge is 
attainable in a given state, and to compute the earliest state in which a specified 
level of knowledge about a given global fact is attainable. 
Acknowledgements: This work was supported by the U.S. National Science 
Foundation grant CCR-9875617. 
Abstract. We introduce a strict hierarchy $\{\mathcal{L}_n^{\mathcal{B}}\}$ of language classes
which exhausts the class of starfree regular languages. It is shown for
all $n \geq 0$ that the classes $\mathcal{L}_n^{\mathcal{B}}$ have decidable membership problems. As
the main result, we prove that our hierarchy is levelwise comparable by
inclusion to the dot-depth hierarchy, more precisely, $\mathcal{L}_n^{\mathcal{B}}$ contains all
languages having dot-depth n + 1/2. This yields a lower bound algorithm for
the dot-depth of a given language. The same results hold for a hierarchy
$\{\mathcal{L}_n^{\mathcal{L}}\}$ and the Straubing-Therien hierarchy.



1 Introduction 

We contribute to the study of starfree regular languages (SF, for short) which 
are constructed from alphabet letters using Boolean operations together with 
concatenation. To determine for a given language the minimal number of al- 
ternations between these two kinds of operations is known as the dot-depth 
problem, recently considered as one of the most important open questions on
regular languages [9]. For an overview we refer to [8].

We deal with the dot-depth hierarchy [3] and the Straubing-Thérien hierarchy
[12,15,13], which both formalize the dot-depth problem in terms of the
membership problems of their hierarchy classes. Fix some finite alphabet $A$ with
$|A| \ge 2$. For a class $\mathcal{C}$ of languages let $\mathrm{Pol}(\mathcal{C})$ be its polynomial closure, i.e., the
closure under finite union and concatenation, and denote by $\mathrm{BC}(\mathcal{C})$ its Boolean
closure. The classes $\mathcal{B}_{n/2}$ of the dot-depth hierarchy (DDH) and the classes $\mathcal{L}_{n/2}$
of the Straubing-Thérien hierarchy (STH) can be defined as follows.

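$\mathcal{B}_{1/2} := \mathrm{Pol}(\{\{a\} : a \in A\} \cup \{A^+\})$        $\mathcal{L}_{1/2} := \mathrm{Pol}(\{A^* a A^* : a \in A\})$

$\mathcal{B}_{n+1} := \mathrm{BC}(\mathcal{B}_{n+1/2})$ for $n \ge 0$        $\mathcal{L}_{n+1} := \mathrm{BC}(\mathcal{L}_{n+1/2})$ for $n \ge 0$

$\mathcal{B}_{n+3/2} := \mathrm{Pol}(\mathcal{B}_{n+1})$ for $n \ge 0$        $\mathcal{L}_{n+3/2} := \mathrm{Pol}(\mathcal{L}_{n+1})$ for $n \ge 0$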

By definition, all these classes are closed under union and it is known that they
are also closed under intersection and under taking residuals [1,10]. Up to now,
levels 1/2, 1 and 3/2 of both hierarchies are known to be decidable [11,7,1,10,6]
while the question is open for any other level. Partial results are known for level
2 of the STH which is decidable if a two-letter alphabet is considered [14].
We take up the discussion started in [6] and look at known results of the type
“L belongs to the class $\mathcal{C}$ if and only if the accepting automaton does not have
subgraph S in its transition graph”. Such a forbidden pattern characterization
implies decidability of the membership problem of $\mathcal{C}$, and even more, it reflects
the effect of language operations in the structure of automata. The present paper
continues this approach in a natural way.

More precisely, we observe how the forbidden pattern characterizing $\mathcal{L}_{1/2}$
acts as a building block in the forbidden pattern which characterizes $\mathcal{L}_{3/2}$ [10].
Surprisingly, we find this observation confirmed if we compare the pattern for
$\mathcal{B}_{1/2}$ [10] with the characterization of $\mathcal{B}_{3/2}$ [6]. Note from the definition above that
in both hierarchies we get from one level to the next with the same operations.
Together, this motivates the introduction of an iteration rule IT on patterns,
which continues the just observed formation procedure.

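In general, starting from an initial class of patterns $\mathcal{I}$, our iteration rule
generates for $n \ge 0$ classes of patterns $\mathbb{P}^{\mathcal{I}}_n$ which in turn define language classes
$\mathbb{L}^{\mathcal{I}}_n$ by prohibiting the patterns $\mathbb{P}^{\mathcal{I}}_n$ in the transition graphs of deterministic finite
automata. We prove that $\mathbb{L}^{\mathcal{I}}_n \cup \mathrm{co}\mathbb{L}^{\mathcal{I}}_n \subseteq \mathbb{L}^{\mathcal{I}}_{n+1} \cap \mathrm{co}\mathbb{L}^{\mathcal{I}}_{n+1}$ and, as the main technical
result, that $\mathrm{Pol}(\mathrm{co}\mathbb{L}^{\mathcal{I}}_n) \subseteq \mathbb{L}^{\mathcal{I}}_{n+1}$ holds (cf. Theorem 1). With the latter we relate
in a very general way Boolean operations and concatenation to the structural
complexity of transition graphs.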

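Then we apply our results to particular initial classes of patterns $\mathcal{B}$ and $\mathcal{L}$
corresponding to the DDH and STH, respectively. As a consequence, we obtain
decidable hierarchies of classes $\mathbb{L}^{\mathcal{B}}_n$ and $\mathbb{L}^{\mathcal{L}}_n$ which exhaust the class of starfree
languages and for which it holds that:

$\mathcal{B}_{1/2} = \mathbb{L}^{\mathcal{B}}_0$        $\mathcal{L}_{1/2} = \mathbb{L}^{\mathcal{L}}_0$

$\mathcal{B}_{3/2} = \mathbb{L}^{\mathcal{B}}_1$        $\mathcal{L}_{3/2} = \mathbb{L}^{\mathcal{L}}_1$

$\mathcal{B}_{n+1/2} \subseteq \mathbb{L}^{\mathcal{B}}_n$        $\mathcal{L}_{n+1/2} \subseteq \mathbb{L}^{\mathcal{L}}_n$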

These inclusions imply in particular a lower bound algorithm for the dot-depth
of a given language L. One just has to determine the class $\mathbb{L}^{\mathcal{B}}_n$ or $\mathbb{L}^{\mathcal{L}}_n$ for minimal
n to which L belongs and it follows that L has at least dot-depth n (for another
lower bound result see [17]). However, it remains to argue that the forbidden
pattern classes are not too large, e.g., if they all equal SF nothing is won. To
this end, we provide more structural similarities between the DDH and STH and
the forbidden pattern classes: All hierarchies show the same inclusion structure
(see Fig. 4) and, interestingly, the typical languages that separate the levels
of the DDH and STH also separate levelwise our forbidden pattern classes. In
particular, it holds that $\mathbb{L}^{\mathcal{L}}_n$ (just as $\mathcal{L}_{n+1/2}$) does not capture $\mathcal{B}_{n+1/2}$.

2 Preliminaries 

All definitions of language classes will be made w.r.t. the fixed alphabet $A$. The
empty word is denoted by $\varepsilon$, the set of all non-empty words over $A$ is denoted by
$A^+$. We consider all languages as subsets of $A^+$. For a class $\mathcal{C}$ of languages the
set of complements is denoted by $\mathrm{co}\mathcal{C} := \{A^+ \setminus L \mid L \in \mathcal{C}\}$. For a word $w \in A^*$
denote by $|w|$ its number of letters. A deterministic finite automaton (dfa) $F$ is
given by $F = (A, S, \delta, s_0, S')$, where $A$ is its input alphabet, $S$ is its set of states,
$\delta : S \times A \to S$ is its total transition function, $s_0 \in S$ is the starting state and
$S' \subseteq S$ is the set of accepting states. We denote by $L(F)$ the language accepted
by $F$. As usual, we extend transition functions to input words, and we denote
by $|F|$ the number of states of $F$. We say that a state $s \in S$ has a loop $v \in A^*$
(has a $v$-loop, for short) if and only if $\delta(s,v) = s$. If a dfa $F$ is fixed we write
$s_1 \xrightarrow{w} s_2$ instead of “$\delta(s_1,w) = s_2$”, and $s_1 \to s_2$ instead of “$\delta(s_1,w) = s_2$
for some $w \in A^*$”. Every $w \in A^*$ induces a total mapping $\delta^w : S \to S$ with
$\delta^w(s) := \delta(s,w)$. We define that a total mapping $\delta' : S \to S$ leads to a certain
structure in a dfa (for instance a $v$-loop) if and only if for all $s \in S$ the state
$\delta'(s)$ has this structure (has a $v$-loop). We will also say that $w \in A^*$ leads to a
certain structure in a dfa if $\delta^w$ does so. An obvious property of dfa’s is that they
run into loops after a small number of successive words in the input.



Proposition 1. Let $F$ be a dfa, $w \in A^*$, $r \ge |F|$. Then $w^{r!}$ leads to a $w^{r!}$-loop.
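The pumping property of Proposition 1 can be checked mechanically on a small automaton. The following is a minimal sketch, assuming the reading $w^{r!}$ of the exponent; the three-state dfa `DELTA` and the helper names are hypothetical and serve only as an illustration.

```python
from itertools import product
from math import factorial


def run(delta, state, word):
    """Extend the transition function to words, letter by letter."""
    for letter in word:
        state = delta[(state, letter)]
    return state


def leads_to_loop(delta, states, prefix, loop):
    """True iff reading `prefix` from any state ends in a state with a `loop`-loop."""
    for s in states:
        t = run(delta, s, prefix)
        if run(delta, t, loop) != t:
            return False
    return True


# Hypothetical 3-state dfa over {a, b}; accepting states are irrelevant here.
STATES = [0, 1, 2]
DELTA = {(0, "a"): 1, (0, "b"): 0,
         (1, "a"): 2, (1, "b"): 0,
         (2, "a"): 1, (2, "b"): 2}


def proposition_1_holds(delta, states, alphabet, max_len=3):
    """Check the statement for r = |F| and all words w up to length max_len."""
    r = len(states)
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            pumped = "".join(letters) * factorial(r)  # the word w^(r!)
            if not leads_to_loop(delta, states, pumped, pumped):
                return False
    return True
```

The word $w$ itself need not lead to a $w$-loop: in this automaton the letter a moves state 0 to state 1, which has no a-loop, so the pumping is essential.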

The following inclusion relations in each hierarchy are easy to see from the 
definitions. We can compare the hierarchies to each other in both directions. 

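Proposition 2. It holds that $\mathcal{B}_{n+1/2} \cup \mathrm{co}\mathcal{B}_{n+1/2} \subseteq \mathcal{B}_{n+1} \subseteq \mathcal{B}_{n+3/2} \cap \mathrm{co}\mathcal{B}_{n+3/2}$
and $\mathcal{L}_{n+1/2} \cup \mathrm{co}\mathcal{L}_{n+1/2} \subseteq \mathcal{L}_{n+1} \subseteq \mathcal{L}_{n+3/2} \cap \mathrm{co}\mathcal{L}_{n+3/2}$ for $n \ge 0$.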



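Proposition 3. For $n \ge 1$ the following holds.

1. $\mathcal{L}_{n-1/2} \subseteq \mathcal{B}_{n-1/2} \subseteq \mathcal{L}_{n+1/2}$

2. $\mathrm{co}\mathcal{L}_{n-1/2} \subseteq \mathrm{co}\mathcal{B}_{n-1/2} \subseteq \mathrm{co}\mathcal{L}_{n+1/2}$

3. $\mathcal{L}_n \subseteq \mathcal{B}_n \subseteq \mathcal{L}_{n+1}$

By [4] we have $\mathrm{SF} = \bigcup_{n \ge 1} \mathcal{B}_{n/2} = \bigcup_{n \ge 1} \mathcal{L}_{n/2}$. By [5] for $n \ge 1$ it holds that
$\mathcal{L}_{n+1/2} = \mathrm{Pol}(\mathrm{co}\mathcal{L}_{n-1/2})$ and $\mathcal{B}_{n+1/2} = \mathrm{Pol}(\mathrm{co}\mathcal{B}_{n-1/2})$.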



3 A Theory of Forbidden Patterns 

We consequently take up the idea of forbidden pattern characterizations and 
develop a general method for a uniform definition of hierarchies via iterated pat- 
terns in transition graphs. Such a definition starts with an initial pattern which 
determines the first level of the corresponding hierarchy. Using an iteration rule 
we obtain more complicated patterns which define the higher levels. Theorem 1 
states that a complementation followed by a polynomial closure operation on 
the language side is captured by our iteration rule on the forbidden pattern side. 
3.1 Hierarchies of Iterated Patterns

The known forbidden pattern characterizations for the levels 1/2 and 3/2 of the
DDH and STH are of the following form: There appear two states $s_1$, $s_2$ and a
word $z$ such that $s_1 \xrightarrow{z} +$, $s_2 \xrightarrow{z} -$ (i.e., $z$ leads from $s_1$ to an accepting and from
$s_2$ to a rejecting state) and we find a certain structure between $s_1$
and $s_2$. Since in the following we consider only patterns of this form it suffices
to describe the structures that occur between $s_1$ and $s_2$. Usually, both states
$s_1$ and $s_2$ have a loop of the same structure in the dfa, the loop-structure. This
structure in turn determines the subgraph we need to find between $s_1$ and $s_2$,
together with the loop-structure we call this the bridge-structure (cf. Fig. 1). Let
us first define what we mean by an initial class of patterns.

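Definition 1. We define an initial pattern $\mathcal{I}$ to be a subset of $A^* \times A^*$ such that
for all $r \ge 1$ and $v, w \in A^*$ it holds that $(v,w) \in \mathcal{I} \Rightarrow (v,v), (v^r, w \cdot v^r) \in \mathcal{I}$.
For $p = (v,w) \in \mathcal{I}$ and given states $s$, $s_1$, $s_2$ of some dfa $F$ we say:

- $p$ appears at $s$ $:\Leftrightarrow$ $s$ has a $v$-loop, and

- $s_1$, $s_2$ are connected via $p$ (in symbols $s_1 \xrightarrow{p} s_2$) $:\Leftrightarrow$ $p$ appears at $s_1$ and $s_2$,
and $s_1 \xrightarrow{w} s_2$.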




Fig. 1. The pattern for $\mathcal{B}_{1/2}$ from [10] can be written as the initial pattern
$\mathcal{B} = A^+ \times A^+$. Here $p = (v,w) \in \mathcal{B}$, shown with its loop-structure and its
bridge-structure.



Consider the initial pattern $\mathcal{B} = A^+ \times A^+$ and some $p = (v,w) \in \mathcal{B}$. We
interpret $p$ as the structure shown in Fig. 1. In the initial case the loop-structure
of $p$ is simply a $v$-loop (cf. Fig. 1), whereas its bridge-structure requires two
states $s_1$, $s_2$ both having a $v$-loop such that $s_1 \xrightarrow{w} s_2$ (cf. Fig. 1). We say
that $p$ appears at some state $s$ if we find the loop-structure of $p$ at this state. In
contrast, if two states $s_1$, $s_2$ are connected via $p$, then we find the bridge-structure
of $p$ between them.

As a next step, we observe how the patterns for the levels 1/2 are used in
those for levels 3/2. In case of the STH the reader may consider the results from
[10] (with an appropriate rewriting of patterns), for the DDH we refer to Fig. 1
and 2. They show that (i) the loop-structures of $\mathcal{B}_{1/2}$ patterns appear on
the path $s_1 \to s_2$ in the $\mathcal{B}_{3/2}$ pattern and (ii) the bridge-structures of $\mathcal{B}_{1/2}$
patterns appear as building blocks in the loop-structures of the $\mathcal{B}_{3/2}$ pattern.
We formalize this observation as the following iteration rule.
Fig. 2. The pattern for $\mathcal{B}_{3/2}$ from [6] can be written as $\mathbb{P}^{\mathcal{B}}_1 = \mathrm{IT}(\mathcal{B})$ with
$\mathcal{B} = A^+ \times A^+$. Here $p = (w_0, p_0, w_1, p_1, \ldots, w_m, p_m) \in \mathbb{P}^{\mathcal{B}}_1$, shown with its
loop-structure and its bridge-structure. Moreover, each $p_i \in \mathcal{B}$ is shown with its own
loop-structure and bridge-structure.



Definition 2. For sets $P$ let $\mathrm{IT}(P) := \{(w_0, p_0, \ldots, w_m, p_m) : p_i \in P, w_i \in A^+\}$.

In the following we start with an initial pattern X and generate classes of iterated 
patterns by repeated applications of IT. 

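Definition 3. For an initial pattern $\mathcal{I}$ we define $\mathbb{P}^{\mathcal{I}}_0 := \mathcal{I}$ and $\mathbb{P}^{\mathcal{I}}_{n+1} := \mathrm{IT}(\mathbb{P}^{\mathcal{I}}_n)$
for $n \ge 0$. For some $p = (w_0, p_0, \ldots, w_m, p_m) \in \mathrm{IT}(\mathbb{P}^{\mathcal{I}}_n)$ and given states $s$, $s_1$,
$s_2$ of some dfa $F$ we say:

- $p$ appears at $s$ $:\Leftrightarrow$ there exist states $q_0, r_0, \ldots, q_m, r_m$ such that

$s \xrightarrow{w_0} q_0 \xrightarrow{p_0} r_0 \xrightarrow{w_1} q_1 \xrightarrow{p_1} r_1 \xrightarrow{w_2} \cdots \xrightarrow{w_m} q_m \xrightarrow{p_m} r_m = s$

- $s_1$, $s_2$ are connected via $p$ (in symbols $s_1 \xrightarrow{p} s_2$) $:\Leftrightarrow$ $p$ appears at $s_1$ and $s_2$,
there exist states $q_0, \ldots, q_m$ such that $s_1 \xrightarrow{w_0} q_0 \xrightarrow{w_1} q_1 \xrightarrow{w_2} \cdots \xrightarrow{w_m} q_m = s_2$
and $p_i$ appears at state $q_i$ for $0 \le i \le m$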
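Definitions 1 to 3 translate directly into two mutually recursive checks. The sketch below is only an illustration under assumed conventions: an element of an initial pattern is encoded as a Python tuple `(v, w)` of strings, an iterated pattern as a list of pairs `(w_i, p_i)`, and the two-state dfa `DELTA` is hypothetical. Since the automaton is deterministic, the only candidate for each witnessing state $r_i$ is the state reached from $q_i$ by the word that the connection via $p_i$ forces, which is what `target_word` computes.

```python
def run(delta, state, word):
    """Extend the transition function to words."""
    for letter in word:
        state = delta[(state, letter)]
    return state


def target_word(p):
    """Word that must lead from s1 to s2 whenever s1, s2 are connected via p."""
    if isinstance(p, tuple):                  # initial pattern element (v, w)
        return p[1]
    return "".join(w_i for w_i, _ in p)       # iterated pattern: w_0 ... w_m


def appears(delta, p, s):
    """Does pattern p appear at state s?"""
    if isinstance(p, tuple):                  # (v, w): s has a v-loop
        return run(delta, s, p[0]) == s
    state = s
    for w_i, p_i in p:
        q = run(delta, state, w_i)            # q_i
        r = run(delta, q, target_word(p_i))   # the only possible r_i
        if not connected(delta, p_i, q, r):
            return False
        state = r
    return state == s                         # r_m = s


def connected(delta, p, s1, s2):
    """Are s1 and s2 connected via pattern p?"""
    if not (appears(delta, p, s1) and appears(delta, p, s2)):
        return False
    if isinstance(p, tuple):                  # (v, w): w leads from s1 to s2
        return run(delta, s1, p[1]) == s2
    state = s1
    for w_i, p_i in p:
        state = run(delta, state, w_i)        # q_i
        if not appears(delta, p_i, state):
            return False
    return state == s2                        # q_m = s2


# Hypothetical dfa over {a, b}: a is a self-loop everywhere, b swaps the states.
DELTA = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 1, (1, "b"): 0}
P0 = ("a", "b")                # element of the initial pattern A+ x A+
P1 = [("a", P0), ("a", P0)]    # element of IT of that initial pattern
```

In this automaton `P1` appears at state 0 (the loop reads a, crosses a `P0` bridge to state 1, reads a, and crosses back), while states 0 and 1 are not connected via `P1`, because the word aa leads from 0 back to 0.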

Again, let us comment on this definition and see how we can understand it
with the known results at hand (cf. Fig. 1 and 2). Consider the initial pattern
$\mathcal{B} = A^+ \times A^+$ and some $p \in \mathbb{P}^{\mathcal{B}}_1$. This means that $p = (w_0, p_0, \ldots, w_m, p_m)$ for
words $w_i$ and elements $p_i \in \mathbb{P}^{\mathcal{B}}_0 = \mathcal{B}$. The loop-structure described by $p$ is a loop
with factors of words $w_0, w_1, \ldots, w_m$ in this ordering such that between each $w_i$,
$w_{i+1}$ we find the bridge-structure of $p_i$. Here we see how elements of $\mathbb{P}^{\mathcal{B}}_0$ appear
as building blocks in the loop-structure of elements of $\mathbb{P}^{\mathcal{B}}_1$. The bridge-structure
of $p$ connects two states $s_1$, $s_2$ such that we find the loop-structure of $p$ at both
of them. Additionally, it holds that $s_1 \xrightarrow{w_0 \cdots w_m} s_2$ and after each prefix $w_0 \cdots w_i$
we reach a state at which the loop-structure of $p_i$ appears (cf. Fig. 2). An
example of the next iteration step for initial pattern $\mathcal{B}$ is given in Fig. 3.

As mentioned at the beginning of this subsection we use the just defined 
patterns in the following way as forbidden patterns in dfa’s. 
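Fig. 3. Pattern $\mathbb{P}^{\mathcal{B}}_2 = \mathrm{IT}(\mathbb{P}^{\mathcal{B}}_1)$. Here $p = (w_0, p_0, \ldots, w_m, p_m) \in \mathbb{P}^{\mathcal{B}}_2$, shown
with its loop-structure and its bridge-structure. Moreover, each $p_i \in \mathbb{P}^{\mathcal{B}}_1$ is shown
with its own loop-structure and bridge-structure.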



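Definition 4. For a dfa $F = (A, S, \delta, s_0, S')$, an initial pattern $\mathcal{I}$ and $n \ge 0$ we
say $F$ has pattern $\mathbb{P}^{\mathcal{I}}_n$ if and only if there exist $s_1, s_2 \in S$, $u, z \in A^*$, $p \in \mathbb{P}^{\mathcal{I}}_n$ such
that $\delta(s_0, u) = s_1$, $\delta(s_1, z) \in S'$, $\delta(s_2, z) \notin S'$ and $s_1 \xrightarrow{p} s_2$.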

3.2 Auxiliary Results 

To handle patterns $p$ in a better way, we define a word $p^{\circ}$ obtained from the
loop-structure of $p$ (call this the loop-word) and a word $\overline{p}$ which is derived from
the bridge-structure of $p$ (bridge-word).

Definition 5. Let $\mathcal{I}$ be an initial pattern. For $p = (v,w) \in \mathbb{P}^{\mathcal{I}}_0$ let $\overline{p} := w$ and
$p^{\circ} := v$. For $n \ge 0$ and $p = (w_0, p_0, \ldots, w_m, p_m) \in \mathbb{P}^{\mathcal{I}}_{n+1}$ let $\overline{p} := w_0 \cdots w_m$ and
$p^{\circ} := w_0 \overline{p_0} \cdots w_m \overline{p_m}$.

The following is clear by definition. If $p$ appears at some state $s$, then this state
has a $p^{\circ}$-loop, and if $s_1$ and $s_2$ are connected via $p$ then the bridge-word $\overline{p}$ leads
from $s_1$ to $s_2$. Moreover, for $n \ge 1$ and $p = (w_0, p_0, \ldots, w_m, p_m) \in \mathbb{P}^{\mathcal{I}}_n$ we have
$\overline{p} \in A^+$, and if $p$ appears at some state then also $p_m$ appears there.
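As a small illustration of Definition 5, the loop-word and bridge-word can be computed recursively. This is a sketch under an assumed encoding (not part of the paper): an element of an initial pattern is a tuple `(v, w)` of strings, an iterated pattern is a list of pairs `(w_i, p_i)`, and the example patterns are hypothetical.

```python
def bridge_word(p):
    """Bridge-word: w for an initial element (v, w), else w_0 ... w_m."""
    if isinstance(p, tuple):
        return p[1]
    return "".join(w_i for w_i, _ in p)


def loop_word(p):
    """Loop-word: v for an initial element (v, w), else w_0 bridge(p_0) ... w_m bridge(p_m)."""
    if isinstance(p, tuple):
        return p[0]
    return "".join(w_i + bridge_word(p_i) for w_i, p_i in p)


# Hypothetical patterns over {a, b}
P0 = ("a", "b")                  # level 0
P1 = [("a", P0), ("a", P0)]      # level 1
P2 = [("b", P1)]                 # level 2
```

For instance, the loop-word of `P1` interleaves the words $w_i$ with the bridge-words of the inner patterns, giving abab, whereas its bridge-word only concatenates the $w_i$, giving aa.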

In order to establish a relation between the polynomial closure operation and
the iteration rule, we isolate the main argument of the proof of Theorem 1 in
Lemma 3 below, for which the following two constructions are needed. First, for
every $p \in \mathbb{P}^{\mathcal{I}}_n$ some $\lambda(p) \in \mathbb{P}^{\mathcal{I}}_n$ can be defined such that if $p$ appears at some state
$s$ then $s, s$ are connected via $\lambda(p)$ (cf. Definition 6 and Lemma 1). Secondly, in
Definition 7 and Lemma 2 we pump up the loop-structure of $p$ to construct for
given $r \ge 3$ some $\pi(p,r) \in \mathbb{P}^{\mathcal{I}}_n$ such that for every dfa $F$ we have

(i) if two states are connected via $p$, then they are connected via $\pi(p,r)$ and

(ii) if $|F| \le r$ then $\pi(p,r)^{\circ}$ and $\overline{\pi(p,r)}$ lead to states where $\pi(p,r)$ appears.

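Definition 6. Let $\mathcal{I}$ be an initial pattern. For $p = (v,w) \in \mathbb{P}^{\mathcal{I}}_0$ let $\lambda(p) := (v,v)$.
For $n \ge 1$ and $p = (w_0, p_0, \ldots, w_m, p_m) \in \mathbb{P}^{\mathcal{I}}_n$ let $\lambda(p) := (p^{\circ}, \lambda(p_m))$.

The following lemma is easy to see by an induction on $n$.

Lemma 1. For every initial pattern $\mathcal{I}$, $n \ge 0$ and $p \in \mathbb{P}^{\mathcal{I}}_n$ we have $\lambda(p) \in \mathbb{P}^{\mathcal{I}}_n$.
Moreover, if $p$ appears at state $s$ of some dfa, then $s, s$ are connected via $\lambda(p)$.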

Definition 7. Let $\mathcal{I}$ be an initial pattern and $r \ge 3$. For $p = (v,w) \in \mathbb{P}^{\mathcal{I}}_0$ let
$\pi(p,r) := (v^{r!}, w \cdot v^{r!})$. For $n \ge 1$ and $p = (w_0, p_0, \ldots, w_m, p_m) \in \mathbb{P}^{\mathcal{I}}_n$ we define:

$p'_i := \pi(p_i, r)$

$w := w_0 \cdot (p'_0)^{\circ} \cdot \overline{p'_0} \cdots w_m \cdot (p'_m)^{\circ} \cdot \overline{p'_m}$

$\pi(p,r) := (w_0 \cdot (p'_0)^{\circ}, p'_0, \ldots, w_m \cdot (p'_m)^{\circ}, p'_m, \underbrace{w, \lambda(p'_m), \ldots, w, \lambda(p'_m)}_{(r!-1) \text{ times ``} w, \lambda(p'_m) \text{''}})$

Again, it is immediate by definition that $\pi(p,r) \in \mathbb{P}^{\mathcal{I}}_n$. Furthermore, a short
observation makes clear that (i) if $p$ appears at state $s$ of some dfa, then also
$\pi(p,r)$ appears there and (ii) if the states $s_1, s_2$ of some dfa are connected via
$p$, then these states are also connected via $\pi(p,r)$. In addition, we prove the
following lemma.

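Lemma 2. Let $\mathcal{I}$ be an initial pattern, $r \ge 3$, $n \ge 0$, $p \in \mathbb{P}^{\mathcal{I}}_n$, and let $F$ be a dfa
with $|F| \le r$.

1. $\pi(p,r)^{\circ}$ leads to states in $F$ where $\pi(p,r)$ appears.

2. $\overline{\pi(p,r)}$ leads to states in $F$ where $\pi(p,r)$ appears.

3. $\pi(p,r)^{\circ}$, $\pi(p,r)^{\circ} \cdot \overline{\pi(p,r)}$ lead to states in $F$ which are connected via $\pi(p,r)$
(i.e., for every state $s$ the states $\delta(s, \pi(p,r)^{\circ})$ and $\delta(s, \pi(p,r)^{\circ} \cdot \overline{\pi(p,r)})$ are
connected via $\pi(p,r)$).

Proof. We prove the lemma by induction on $n$. For $n = 0$ we have $p = (v,w)$
and $\pi(p,r) = (v^{r!}, w \cdot v^{r!})$. Since $v^{r!}$ leads to $v^{r!}$-loops in $F$, we obtain that
$\pi(p,r)^{\circ} = v^{r!}$ and $\overline{\pi(p,r)} = w \cdot v^{r!}$ lead to states where $\pi(p,r)$ appears. Hence
$\pi(p,r)^{\circ}$, $\pi(p,r)^{\circ} \cdot \overline{\pi(p,r)}$ lead to states which are connected via $\pi(p,r)$.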

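For the induction step let $n = l + 1$, $p = (w_0, p_0, \ldots, w_m, p_m) \in \mathbb{P}^{\mathcal{I}}_{l+1}$, and
let $w$, $p'_i$ be as in Definition 7. First of all we show the following claim.

Claim 1: $w^{r!}$ leads to states in $F$ where $\pi(p,r)$ appears.

Observe that $w^{r!}$ leads to a $w^{r!}$-loop in $F$. So let $s$ be a state in $F$ that has
a $w^{r!}$-loop, we will show that $\pi(p,r)$ appears at $s$. Define the witnessing states:

$q_0 := \delta(s, w_0 \cdot (p'_0)^{\circ})$        $r_0 := \delta(q_0, \overline{p'_0})$

$q_i := \delta(r_{i-1}, w_i \cdot (p'_i)^{\circ})$        $r_i := \delta(q_i, \overline{p'_i})$ for $1 \le i \le m$

$q_{m+j} := \delta(r_m, w^j)$        $r_{m+j} := q_{m+j}$ for $1 \le j \le r! - 1$

So we have the following situation where $m' := m + r! - 1$.

$s \xrightarrow{w_0 \cdot (p'_0)^{\circ}} q_0 \xrightarrow{\overline{p'_0}} r_0 \xrightarrow{w_1 \cdot (p'_1)^{\circ}} q_1 \xrightarrow{\overline{p'_1}} r_1 \xrightarrow{w_2 \cdot (p'_2)^{\circ}} \cdots \xrightarrow{w_m \cdot (p'_m)^{\circ}} q_m \xrightarrow{\overline{p'_m}} r_m$

$r_m \xrightarrow{w} q_{m+1} = r_{m+1} \xrightarrow{w} q_{m+2} = r_{m+2} \xrightarrow{w} \cdots \xrightarrow{w} q_{m'} = r_{m'}$

Therefore, from induction hypothesis it follows that $q_i, r_i$ are connected via $p'_i$
for $0 \le i \le m$. Moreover, the hypothesis also shows that $p'_m$ appears at $q_j$ for
$m + 1 \le j \le m'$, since $\overline{p'_m}$ is a suffix of $w$. From Lemma 1 it follows that $q_j, r_j$
are connected via $\lambda(p'_m)$. Finally, by the definition of $w$ we have $r_m = \delta(s,w)$
and $r_{m'} = \delta(s, w^{r!}) = s$. Hence we have shown the following.

$s \xrightarrow{w_0 \cdot (p'_0)^{\circ}} q_0 \xrightarrow{p'_0} r_0 \xrightarrow{w_1 \cdot (p'_1)^{\circ}} q_1 \xrightarrow{p'_1} r_1 \xrightarrow{w_2 \cdot (p'_2)^{\circ}} \cdots \xrightarrow{w_m \cdot (p'_m)^{\circ}} q_m \xrightarrow{p'_m} r_m$

$r_m \xrightarrow{w} q_{m+1} \xrightarrow{\lambda(p'_m)} r_{m+1} \xrightarrow{w} q_{m+2} \xrightarrow{\lambda(p'_m)} r_{m+2} \xrightarrow{w} \cdots \xrightarrow{w} q_{m'} \xrightarrow{\lambda(p'_m)} r_{m'} = s$

So $\pi(p,r)$ appears at $s$ which shows our claim.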

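Since $\overline{p'_m}$ is a suffix of $w$, it follows that $w$ leads to states where $p'_m$ appears
(induction hypothesis). From Lemma 1 we obtain that $w$ leads to a $\overline{\lambda(p'_m)}$-loop
in $F$. Hence Claim 1 also holds for $(w \cdot \overline{\lambda(p'_m)})^{r!}$. Now observe the following.

$\overline{\pi(p,r)} = w_0 \cdot (p'_0)^{\circ} \cdots w_m \cdot (p'_m)^{\circ} \cdot w^{r!-1}$

$\pi(p,r)^{\circ} = w_0 \cdot (p'_0)^{\circ} \cdot \overline{p'_0} \cdots w_m \cdot (p'_m)^{\circ} \cdot \overline{p'_m} \cdot (w \cdot \overline{\lambda(p'_m)})^{r!-1}$

It follows that $\pi(p,r)^{\circ}$ and $\overline{\pi(p,r)}$ lead to states in $F$ where $\pi(p,r)$ appears. This
shows the statements 1 and 2 of the lemma.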

Let us turn to statement 3 and choose an arbitrary state $s$ of $F$. For
$s_1 := \delta(s, \pi(p,r)^{\circ})$ and $s_2 := \delta(s, \pi(p,r)^{\circ} \cdot \overline{\pi(p,r)})$ we show that $s_1, s_2$ are connected
via $\pi(p,r)$. Let $m' := m + r! - 1$ and define the following witnessing states.

$q_0 := \delta(s_1, w_0 \cdot (p'_0)^{\circ})$

$q_{i+1} := \delta(q_i, w_{i+1} \cdot (p'_{i+1})^{\circ})$ for $0 \le i < m$

$q_{j+1} := \delta(q_j, w)$ for $m \le j < m'$

We have already seen that $\pi(p,r)$ appears at $s_1$ and at $s_2$. Observe that
$q_{m'} = \delta(s_1, \overline{\pi(p,r)}) = s_2$. So it remains to show that (i) $p'_i$ appears at $q_i$ for $0 \le i \le m$
and (ii) $\lambda(p'_m)$ appears at $q_j$ for $m + 1 \le j \le m'$.

By induction hypothesis, $(p'_i)^{\circ}$ leads to states in $F$ where $p'_i$ appears. Hence
$p'_i$ appears at state $q_i$ for $0 \le i \le m$. Note that $\overline{p'_m}$ is a suffix of $w$. So the
induction hypothesis shows that $p'_m$ appears at $q_j$ for all $j$ with $m + 1 \le j \le m'$.
From Lemma 1 it follows that $q_j, q_j$ are connected via $\lambda(p'_m)$. Particularly, $\lambda(p'_m)$
appears at state $q_j$. This proves the lemma.

Now we isolate the main argument of the proof of Theorem 1 . The following 
lemma says that under certain assumptions we can replace bridge-words by their 
respective loop-words without leaving the language of some dfa. 
Lemma 3. Let $\mathcal{I}$ be an initial pattern, $r \ge 3$, $n \ge 0$, $p \in \mathbb{P}^{\mathcal{I}}_{n+1}$ and let $F$ be a
dfa with $|F| \le r$ which does not have pattern $\mathbb{P}^{\mathcal{I}}_n$. Then for all $u, z \in A^*$ we have

$u\,\overline{\pi(p,r)}\,z \in L(F) \Longrightarrow u\,\pi(p,r)^{\circ}\,z \in L(F)$.

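Proof. Suppose $F = (A, S, \delta, s_0, S')$ and $p = (w_0, p_0, \ldots, w_m, p_m)$ for suitable
$m \ge 0$, $w_i \in A^+$ and $p_i \in \mathbb{P}^{\mathcal{I}}_n$. Let $u, z \in A^*$ such that $u\,\overline{\pi(p,r)}\,z \in L(F)$, and let
$p'_i$ and $w$ be as in Definition 7. Compare the following factorizations.

$\pi(p,r)^{\circ} = w_0 \cdot (p'_0)^{\circ} \cdot \overline{p'_0} \cdots w_m \cdot (p'_m)^{\circ} \cdot \overline{p'_m} \cdot (w \cdot \overline{\lambda(p'_m)})^{r!-1}$        (1)

$\overline{\pi(p,r)} = w_0 \cdot (p'_0)^{\circ} \cdots w_m \cdot (p'_m)^{\circ} \cdot w^{r!-1}$        (2)

We already know that (i) $\overline{p'_m}$ leads to a state in $F$ where $p'_m$ appears and that
(ii) such a state has a connection with itself via $\lambda(p'_m)$. It follows that $\overline{p'_m}$ leads
to a $\overline{\lambda(p'_m)}$-loop in $F$, which in turn implies that also $w$ leads to such a loop.
So taking (1) and (2) into account, it remains to show the following for all
$u', z' \in A^*$.

$u' \cdot (p'_i)^{\circ} \cdot z' \in L(F) \Longrightarrow u' \cdot (p'_i)^{\circ} \cdot \overline{p'_i} \cdot z' \in L(F)$

Suppose the contrary and let $s_1 := \delta(s_0, u' (p'_i)^{\circ})$, $s_2 := \delta(s_0, u' (p'_i)^{\circ} \overline{p'_i})$. By
Lemma 2.3 we know that $s_1, s_2$ are connected via $p'_i$. Since $\delta(s_1, z')$ is accepting
and $\delta(s_2, z')$ is rejecting, we have found pattern $\mathbb{P}^{\mathcal{I}}_n$ in $F$, a contradiction.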

3.3 Pattern Iterator versus Polynomial Closure 

The proof of the following theorem can be carried out with Lemma 3. It says that
pattern iteration captures complementation followed by polynomial closure.

Definition 8. Let $\mathcal{I}$ be an initial pattern. For $n \ge 0$ we define the class of
languages corresponding to $\mathbb{P}^{\mathcal{I}}_n$ as

$\mathbb{L}^{\mathcal{I}}_n := \{L \subseteq A^+ : L$ is accepted by some dfa $F$ which does not have pattern $\mathbb{P}^{\mathcal{I}}_n\}$.
With Lemma 2 one can show that this is well-defined.

Theorem 1. Let $\mathcal{I}$ be an initial pattern and $n \ge 0$. Then $\mathrm{Pol}(\mathrm{co}\mathbb{L}^{\mathcal{I}}_n) \subseteq \mathbb{L}^{\mathcal{I}}_{n+1}$.

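Proof. We assume that there exists an $L \in \mathrm{Pol}(\mathrm{co}\mathbb{L}^{\mathcal{I}}_n) \setminus \mathbb{L}^{\mathcal{I}}_{n+1}$, this will lead to
a contradiction. Let $F = (A, S, \delta, s_0, S')$ be some dfa with $L(F) = L$. Since
$L \in \mathrm{Pol}(\mathrm{co}\mathbb{L}^{\mathcal{I}}_n)$, we have

$L = \bigcup_{i=1}^{k} L_{i,1} L_{i,2} \cdots L_{i,k_i}$

for languages $L_{i,j} \in \mathrm{co}\mathbb{L}^{\mathcal{I}}_n$. Choose $r \ge 1$ sufficiently large, i.e., larger than $k$,
$k_i$ and the size of some dfa’s accepting $L$, $L_{i,j}$ and the complement of $L_{i,j}$.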
Since $L \notin \mathbb{L}^{\mathcal{I}}_{n+1}$ there exist states $s_1, s_2 \in S$ and words $u, z \in A^*$ such that
$s_1, s_2$ are connected via some $p \in \mathbb{P}^{\mathcal{I}}_{n+1}$, $\delta(s_0, u) = s_1$, $\delta(s_1, z)$ is accepting
and $\delta(s_2, z)$ is rejecting. It follows that $s_1, s_2$ are also connected via $\pi(p,r)$ and
$u\,(\pi(p,r)^{\circ})^{r} z \in L_{i',1} \cdots L_{i',k_{i'}}$ for some $1 \le i' \le k$. Since $r > k_{i'}$, the
latter word can be factorized as $u'' u'\,\pi(p,r)^{\circ} z' z''$ such that $u'' \in L_{i',1} \cdots L_{i',j'-1}$,
$u'\,\pi(p,r)^{\circ} z' \in L_{i',j'}$ and $z'' \in L_{i',j'+1} \cdots L_{i',k_{i'}}$ for some $j' \le k_{i'}$. Because there
is a dfa of size $\le r$ accepting $L_{i',j'}$, the word $\pi(p,r)^{\circ}$ leads to a $\pi(p,r)^{\circ}$-loop in
this dfa (Lemma 2.1). Hence for all $j \ge 1$ we obtain

$u'\,(\pi(p,r)^{\circ})^{j} z' \in L_{i',j'}$        (3)

Moreover, $u'\,\pi(p,r)^{\circ}\,\overline{\pi(p,r)}\,z' \notin L_{i',j'}$, otherwise we would obtain

$u\,(\pi(p,r)^{\circ})^{j}\,\overline{\pi(p,r)}\,(\pi(p,r)^{\circ})^{r-j} z \in L$

for some $j \le r$, which in turn implies the contradiction $u\,\overline{\pi(p,r)}\,z \in L$ (recall
that $s_1, s_2$ are connected via $\pi(p,r)$ in $F$). Observe that some dfa accepting the
complement of $L_{i',j'}$ is of size $\le r$ and does not have pattern $\mathbb{P}^{\mathcal{I}}_n$. From Lemma 3
it follows that $u'\,\pi(p,r)^{\circ}\,\pi(p,r)^{\circ} z' \notin L_{i',j'}$. This contradicts (3).



3.4 Hierarchies, Decidability, and Starfreeness 

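Let $\mathcal{I}$, $\mathcal{J}$ be initial patterns and $i, j \ge 0$. We say that any pattern from $\mathbb{P}^{\mathcal{J}}_j$ can
be interpreted as a pattern from $\mathbb{P}^{\mathcal{I}}_i$ if and only if for every $p \in \mathbb{P}^{\mathcal{J}}_j$ there exists
a $p' \in \mathbb{P}^{\mathcal{I}}_i$ such that (i) if $p$ appears at state $s$ of some dfa, then also $p'$ appears
at this state and (ii) if the states $s_1, s_2$ of some dfa are connected via $p$, then
they are also connected via $p'$. An easy induction shows that if any pattern from
$\mathbb{P}^{\mathcal{J}}_j$ can be interpreted as a pattern from $\mathbb{P}^{\mathcal{I}}_i$, then $\mathbb{L}^{\mathcal{I}}_{i+n} \subseteq \mathbb{L}^{\mathcal{J}}_{j+n}$ for all $n \ge 0$.

Particularly, if any pattern from $\mathbb{P}^{\mathcal{I}}_1$ can be interpreted as a pattern from $\mathbb{P}^{\mathcal{I}}_0$
(which is a weak assumption), then we obtain $\mathbb{L}^{\mathcal{I}}_n \subseteq \mathbb{L}^{\mathcal{I}}_{n+1}$ for $n \ge 0$. Together
with Theorem 1 this yields $\mathbb{L}^{\mathcal{I}}_n \cup \mathrm{co}\mathbb{L}^{\mathcal{I}}_n \subseteq \mathbb{L}^{\mathcal{I}}_{n+1} \cap \mathrm{co}\mathbb{L}^{\mathcal{I}}_{n+1}$.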

Let us turn to the decidability of pattern classes. It is reasonable to consider
initial patterns $\mathcal{I}$ such that for every $k \ge 1$ there exists an algorithm $A_k$ which
does the following in nondeterministic logspace NL: On input $F$, $k$ states of $F$
and $k$ pairs of states of $F$ it decides whether there is some $p \in \mathcal{I}$ appearing at
each of the given single states and connecting each of the given pairs. For $n \ge 0$
this leads by induction to an NL-algorithm for the membership problem of $\mathbb{L}^{\mathcal{I}}_n$.

The pattern iterator IT can be considered as a starfree iterator. Let $\mathcal{I}$ be
an arbitrary initial pattern and recall that SF denotes the class of starfree
languages. One can show that for $n \ge 1$ it holds that $\mathbb{L}^{\mathcal{I}}_n \subseteq \mathrm{SF}$ if and only if
$\bigcup_{i \ge 0} \mathbb{L}^{\mathcal{I}}_i \subseteq \mathrm{SF}$ (actually this does not hold for $n = 0$).
4 Consequences for Concatenation Hierarchies

From now on we consider two special initial patterns. With $\mathcal{L} := \{\varepsilon\} \times A^*$ and
$\mathcal{B} := A^+ \times A^+$ we meet the known forbidden pattern characterizations for $\mathcal{L}_{1/2}$
and $\mathcal{B}_{1/2}$ from [10]. Furthermore, if we compare the characterizations for $\mathcal{L}_{3/2}$ and
$\mathcal{B}_{3/2}$ from [10] and [6], respectively, we observe that $\mathcal{L}_{3/2} = \mathbb{L}^{\mathcal{L}}_1$ and $\mathcal{B}_{3/2} = \mathbb{L}^{\mathcal{B}}_1$.
From the results of the previous section we obtain for the pattern classes the same
inclusion structure as it is known for the concatenation hierarchies in question
(see Propositions 2 and 3). Moreover, it follows from Theorem 1 that the pattern
classes contain the respective levels of the concatenation hierarchies (cf. Fig. 4).



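Theorem 2. For $n \ge 0$ the following holds.

1. $\mathbb{L}^{\mathcal{L}}_n \cup \mathrm{co}\mathbb{L}^{\mathcal{L}}_n \subseteq \mathbb{L}^{\mathcal{L}}_{n+1} \cap \mathrm{co}\mathbb{L}^{\mathcal{L}}_{n+1}$

2. $\mathbb{L}^{\mathcal{B}}_n \cup \mathrm{co}\mathbb{L}^{\mathcal{B}}_n \subseteq \mathbb{L}^{\mathcal{B}}_{n+1} \cap \mathrm{co}\mathbb{L}^{\mathcal{B}}_{n+1}$

3. $\mathbb{L}^{\mathcal{L}}_n \subseteq \mathbb{L}^{\mathcal{B}}_n \subseteq \mathbb{L}^{\mathcal{L}}_{n+1}$



Theorem 3. For $n \ge 0$ it holds that $\mathcal{L}_{n+1/2} \subseteq \mathbb{L}^{\mathcal{L}}_n$ and $\mathcal{B}_{n+1/2} \subseteq \mathbb{L}^{\mathcal{B}}_n$.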



Fig. 4. Concatenation hierarchies and forbidden pattern classes. Inclusions
hold from bottom to top, doubled lines stand for equality.



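The pattern hierarchies even exhaust the class of starfree languages.
Theorem 4. It holds that $\bigcup_{n \ge 0} \mathbb{L}^{\mathcal{B}}_n = \bigcup_{n \ge 0} \mathbb{L}^{\mathcal{L}}_n = \mathrm{SF}$.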







Christian Glaßer and Heinz Schmitz



Next, we want to show the strictness of $\{\mathbb{L}^{\mathcal{B}}_n\}$ and $\{\mathbb{L}^{\mathcal{L}}_n\}$ in a certain way,
namely we take witnessing languages from [16] (see also [2]) that were used there
to separate the classes of the DDH. As remarked in [16], these languages can also
be used to show that the STH is strict. W.l.o.g. we assume that $A = \{a, b\}$. Let
us recall the definition of a particular family of languages of $A^+$ from [16]. Denote
for $w \in A^+$ by $|w|_a$ the number of occurrences of the letter $a$ in $w$. Now define for
$n \ge 1$ the language $L_n$ to be the set of words $w \in A^+$ such that $|w|_a - |w|_b = n$
and for every prefix $u$ of $w$ it holds that $0 \le |u|_a - |u|_b \le n$. It was shown
in [16] that $L_n \in \mathcal{B}_n \setminus \mathcal{B}_{n-1}$. So with $\mathcal{B}_n \subseteq \mathcal{B}_{n+1/2}$ we get from Theorem 3 that
$L_n \in \mathbb{L}^{\mathcal{B}}_n$. We can prove that the minimal dfa $F_n$ with $L_n = L(F_n)$ has pattern
$\mathbb{P}^{\mathcal{B}}_{n-1}$. It follows that $L_n \in \mathbb{L}^{\mathcal{B}}_n \setminus \mathbb{L}^{\mathcal{B}}_{n-1}$.
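Membership in the witnessing languages $L_n$ amounts to tracking the running difference between the number of a's and b's. The following minimal sketch assumes the alphabet {a, b} as above; the function name `in_L` is hypothetical.

```python
def in_L(n, word):
    """Is `word` in L_n, i.e. |w|_a - |w|_b = n and 0 <= |u|_a - |u|_b <= n for all prefixes u?"""
    if not word:
        return False                      # L_n is a subset of A+
    diff = 0
    for letter in word:
        if letter not in "ab":
            raise ValueError("words are taken over the alphabet {a, b}")
        diff += 1 if letter == "a" else -1
        if not 0 <= diff <= n:            # a prefix leaves the allowed band
            return False
    return diff == n
```

The running difference takes at most $n + 1$ admissible values, so a dfa for $L_n$ can be built with these counter values as states plus one rejecting sink.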

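Theorem 5. Let $n \ge 1$. Then the following holds.

1. $\mathbb{L}^{\mathcal{B}}_{n-1} \subsetneq \mathbb{L}^{\mathcal{B}}_n$ and $\mathcal{B}_{n+1/2} \not\subseteq \mathbb{L}^{\mathcal{B}}_{n-1}$.

2. $\mathbb{L}^{\mathcal{L}}_{n-1} \subsetneq \mathbb{L}^{\mathcal{L}}_n$ and $\mathcal{L}_{n+1/2} \not\subseteq \mathbb{L}^{\mathcal{L}}_{n-1}$.

We adapt the well-known algorithm that solves the graph accessibility problem
to show that the initial patterns $\mathcal{L}$ and $\mathcal{B}$ allow algorithms $A_k$ as mentioned in
subsection 3.4, so the membership problems for $\mathbb{L}^{\mathcal{B}}_n$ and $\mathbb{L}^{\mathcal{L}}_n$ are decidable in NL.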
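For the lowest level, the reduction to graph accessibility can be made concrete. The sketch below is an illustration only and is not the NL procedure of the paper: it decides deterministically whether a dfa has pattern $\mathbb{P}^{\mathcal{B}}_0$ with $\mathcal{B} = A^+ \times A^+$ by searching the product graph on pairs of states. The function names and the two example automata are hypothetical.

```python
from itertools import product


def pairs_after_common_word(delta, alphabet, s1, s2):
    """All pairs (delta(s1, x), delta(s2, x)) for some common non-empty word x."""
    frontier = [(delta[(s1, c)], delta[(s2, c)]) for c in alphabet]
    seen = set()
    while frontier:
        pair = frontier.pop()
        if pair in seen:
            continue
        seen.add(pair)
        frontier.extend((delta[(pair[0], c)], delta[(pair[1], c)]) for c in alphabet)
    return seen


def has_pattern_B0(states, alphabet, delta, start, accepting):
    """Does the dfa have the level-0 pattern for the initial pattern A+ x A+?"""
    reachable = {start} | {x for x, _ in pairs_after_common_word(delta, alphabet, start, start)}
    for s1, s2 in product(reachable, states):
        plus = pairs_after_common_word(delta, alphabet, s1, s2)
        common_loop = (s1, s2) in plus        # some v in A+ loops at s1 and at s2
        bridge = any(x == s2 for x, _ in      # some w in A+ leads from s1 to s2
                     pairs_after_common_word(delta, alphabet, s1, s1))
        split = any(x in accepting and y not in accepting   # some z in A* separates them
                    for x, y in plus | {(s1, s2)})
        if common_loop and bridge and split:
            return True
    return False


# Hypothetical dfa for a+ over {a, b}: state 2 is a rejecting sink.
A_PLUS = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 1, (1, "b"): 2, (2, "a"): 2, (2, "b"): 2}
# Hypothetical dfa for the words ending in a.
ENDS_A = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}
```

The automaton for a+ has the pattern (take $s_1 = 1$, $s_2 = 2$, $v = a$, $w = b$ and $z = \varepsilon$), while the automaton for the words ending in a does not, which agrees with the latter language being a finite union of products of letters and $A^+$.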
Abstract. Church-Rosser languages are mainly based on confluent
length reducing string rewriting systems. In general, the prefix language
of a Church-Rosser language may not be describable by such a system
either. In this paper it is shown that under certain conditions it is possible
to give a construction for a system defining the prefix language and to
prove its correctness. The construction also gives a completion of prefixes
to full words in the original language. This is an interesting property for
practical applications, as it shows potential for error recovery strategies
in parsers.



1 Introduction 

The Church-Rosser languages (CRL) are a relatively new class of languages. 
Basically they are defined by confluent and length reducing string-rewriting sys- 
tems with a distinction between terminals and nonterminals and the possibility 
to mark word ends. They were defined by McNaughton, Narendran, and Otto
in [MNO88] and are the deterministic variant of the growing context sensitive
languages defined by Dahlhaus and Warmuth in [DW86]. This was proved by
Niemann and Otto in [NO98]. On the one hand, they have some nice properties,
e.g. solvability of the word problem in deterministic linear time, closure against 
the mirror operation, and they are a superset of the deterministic context free 
languages (detCF). On the other hand, they are a basis of the recursively enu- 
merable languages. That means, given an alphabet if ((f, U ^ S) and any r.e. 
language L C E* there is a CRL L' C S* ■ {((} • {ji}* so that deleting the letters 
(t and U with a homomorphism h which leaves letters of E unchanged leads to 
$h(L') = L$ (see also [OKK97]). Therefore there are prefix languages of CRLs
which are not CRLs themselves. So a natural question is, given a rewriting
system for a CRL, under which conditions a new CRL system can be constructed
and proved to be correct that delivers exactly the prefix language. In this article 
it is shown that under some restrictions for the CRL system this is possible. 
There also is a practical aspect. Although the theory of parsing programming 
languages seems to be fully elaborated (with [Knu65] being a turning point), its 
main coverage is the parsing of correct programs. Even relatively new compiler 
generators like Cup [App98] do not provide much help to produce useful output 
in the error case. Most error recovery strategies are either relatively simple or
they are very complicated and time consuming [SSS90], [App98]. The prefix
construction for CRL given in this paper offers some new potential. Whenever
the sufficient conditions are fulfilled, the construction does not only produce a
CRL system for the prefix language; it also gives a method to deduce correct
completions to a word of the language.^1 This is more than the prefix closure
proof of detCF (see for example [Har78]) delivers.

This paper is organised in the following way: The next section gives the necessary
basic definitions and a technical result about a normal form for CRLS’s. Also, the
construction for so-called prefix systems is introduced. The third section contains
some examples for the effects of the construction and how prefixes are accepted.
The fourth section goes into the details of the correctness problem in a slightly
informal way and adds the construction of an enriched version of prefix systems.
In the fifth section the main result is stated and proved.

Because of the limited space this text can only contain basic ideas. It tries to give
an overview which allows the reader to assess the theoretical value of the results.
Proofs of the theorems are rather technical and lengthy. Because of this they are
omitted or only briefly sketched. Full proofs can be found in the technical report
[Woi00b].



2 Basic Definitions and Ideas 



The following definitions are mostly necessary to identify the notations used
throughout this paper. See also [Har78], [Jan88], [BO93], and [BO98].

Let $\Sigma$ be a finite alphabet, $\Sigma^*$ denotes the free monoid over $\Sigma$, for the empty
word we write □. A subset $L \subseteq \Sigma^*$ is called a language. If $w$ is a word of length $n$,
we write $|w| := n$. To address single letters of $w$ we use $w = a_1 \cdots a_i \cdots a_n$, $a_i \in \Sigma$
$(1 \le i \le n)$. Then $\mathrm{Pref}(w)$ ($\mathrm{Suff}(w)$) is the set of prefixes (suffixes) of $w$. With
$L$ being a language, $\mathrm{Pref}(L) := \bigcup_{w \in L} \mathrm{Pref}(w)$ and $\mathrm{Suff}(L) := \bigcup_{w \in L} \mathrm{Suff}(w)$.

w^L w^L 



A string-rewriting system (or simply rewriting system) $R$ on $\Sigma$ is a subset of
$\Sigma^* \times \Sigma^*$. For $(u,v) \in R$ we also write $(u \to v) \in R$ and call $(u,v)$ a rule.
The rewriting relation between words in $\Sigma^*$ is defined as
$\to_R := \{(sut, svt) \mid s, t \in \Sigma^*, (u,v) \in R\}$. The reflexive and transitive closure is
denoted with $\to_R^*$. A word $w \in \Sigma^*$ is called irreducible modulo $R$ if there exists
no $w'$ with $w \to_R w'$.

The set of all irreducible words of $R$ is denoted with $\mathrm{Irr}(R)$.
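The rewriting relation and the notion of irreducibility can be sketched in a few lines. This is a minimal illustration, assuming rules are given as pairs of Python strings with non-empty left-hand sides; the one-rule system `RULES` is hypothetical, and the loop in `reduce_fully` only terminates for terminating systems such as length reducing ones.

```python
def rewrite_once(rules, word):
    """Apply one rule (u, v) to some factorization word = s u t, or return None."""
    for lhs, rhs in rules:
        pos = word.find(lhs)
        if pos != -1:
            return word[:pos] + rhs + word[pos + len(lhs):]
    return None


def is_irreducible(rules, word):
    """A word is irreducible modulo R if no rule applies anywhere in it."""
    return rewrite_once(rules, word) is None


def reduce_fully(rules, word):
    """Rewrite until an irreducible word is reached (assumes termination)."""
    while True:
        successor = rewrite_once(rules, word)
        if successor is None:
            return word
        word = successor


# Hypothetical length reducing system with the single rule aabb -> ab
RULES = [("aabb", "ab")]
```

For example, aaabbb rewrites to aabb and then to the irreducible word ab, while abab is already irreducible.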



A weight function is a function $f : \Sigma \to \mathbb{N}$. It is recursively extended to a
function on $\Sigma^*$ by $f(wx) := f(w) + f(x)$ and $f(□) := 0$ with $w \in \Sigma^*$, $x \in \Sigma$.
An example for a weight function is the length function with $f(x) := 1$ for all
$x \in \Sigma$, then $f(w) = |w|$. A string-rewriting system $R$ will be called a weight
reducing system, if there exists a weight function $f$ so that $f(u) > f(v)$ for all
$(u,v) \in R$.
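Checking that a given weight function witnesses weight reduction is a direct computation. The sketch below assumes a weight function is represented as a dictionary from letters to natural numbers; the rules and weights are hypothetical examples.

```python
def weight(f, word):
    """Extend a letter weight function f to words: f(wx) = f(w) + f(x), f(empty) = 0."""
    return sum(f[letter] for letter in word)


def is_weight_reducing(rules, f):
    """True iff f(u) > f(v) holds for every rule (u, v)."""
    return all(weight(f, u) > weight(f, v) for u, v in rules)


LENGTH = {"a": 1, "b": 1, "c": 1}      # the length function
HEAVY_A = {"a": 3, "b": 1, "c": 1}     # a hypothetical weight function
```

The rule a → bb is not length reducing, yet it is weight reducing with respect to `HEAVY_A`, since the left-hand side has weight 3 and the right-hand side weight 2.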



^1 This already has been implemented in a CRL development system [Rot00].
A string-rewriting system $R$ is confluent if for all $w, w_1, w_2$ with $w \to_R^* w_1$ and
$w \to_R^* w_2$ there exists a $w_3 \in \mathrm{Irr}(R)$ so that $w_1 \to_R^* w_3$ and $w_2 \to_R^* w_3$. If
so, and if $R$ is a weight reducing system^2, $w_3$ is unique and is called irreducible
normal form of $w$ (and, of course, $w_1$ and $w_2$). For a word $w$ we denote its
irreducible normal form with $[w]_R$.

Definition 1. A Church-Rosser language system (CRLS) is a 6-tuple $C =
(\Gamma, \Sigma, R, k_l, k_r, Y)$ with finite alphabet $\Gamma$, terminal alphabet $\Sigma \subset \Gamma$ ($\Gamma \setminus \Sigma$ is the
alphabet of nonterminals), finite confluent weight reducing system $R \subseteq \Gamma^* \times \Gamma^*$,
left and right end marker words $k_l, k_r \in (\Gamma \setminus \Sigma)^* \cap \mathrm{Irr}(R)$, and accepting
letter $Y \in (\Gamma \setminus \Sigma) \cap \mathrm{Irr}(R)$. The language defined by $C$ is defined as:
$L_C := \{w \in \Sigma^* \mid k_l \cdot w \cdot k_r \to_R^* Y\}$.

A language $L$ is called a Church-Rosser language (CRL) if there exists a CRLS
$C$ with $L_C = L$. We say that a word $w$ is accepted by $C$ if $w \in L_C$, thus stressing
the fact that a CRL is defined by a reduction process.
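Acceptance by a CRLS can be illustrated with a toy system. The sketch below is not taken from the paper: it assumes a hypothetical CRLS for the language of words $a^n b^n$ with $n \ge 1$, using single-letter end markers ¢ and $, accepting letter Y, and the two length reducing rules aabb → ab and ¢ab$ → Y. The left-hand sides of these rules do not overlap, so the system is confluent and the order of rule application does not matter.

```python
# Hypothetical toy CRLS: end markers, accepting letter and rules are assumptions.
LEFT, RIGHT, ACCEPT = "¢", "$", "Y"
RULES = [("aabb", "ab"), (LEFT + "ab" + RIGHT, ACCEPT)]


def normal_form(rules, word):
    """Rewrite with length reducing rules until no rule applies."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            pos = word.find(lhs)
            if pos != -1:
                word = word[:pos] + rhs + word[pos + len(lhs):]
                changed = True
                break
    return word


def accepts(word):
    """w is accepted iff the marked word reduces to the accepting letter."""
    return normal_form(RULES, LEFT + word + RIGHT) == ACCEPT
```

A word such as aaabbb is reduced between the markers to ab and then to Y, whereas aab stays irreducible between its markers and is rejected.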

To address the rewriting system $R$ of a CRLS $C$, we use $\mathrm{Rew}(C)$.

The definition of Church-Rosser languages is due to McNaughton, Narendran,
and Otto [MNO88]. The definition of Church-Rosser language systems given
here is a convenient notation for their definition. Niemann and Otto proved in
[NO98] that the expressive power of Church-Rosser languages is not enhanced
by allowing arbitrary weight functions instead of the length function, so this fact
is used, too.

In order to be able to give a prefix construction for CRLS’s, some restrictions
will be made. The first observation is that detecting the left and right end of a
word is relatively difficult because of the arbitrary end marker words \(k_l\) and \(k_r\).
In consequence, only single letters will be used: \(k_l = ¢\) and \(k_r = \$\). Furthermore
it will be required that these are not changed, removed, or added throughout the
reduction process. They are deleted only when the word is accepted. Secondly,
we will require all rules to be of a limited form which makes some operations
easier. This form is inspired by the shift-reduce automata for deterministic
context-free languages. Furthermore it is very similar to well known forms of
context-sensitive grammars.

Definition 2. A CRLS \(C = (\Gamma, \Sigma, R, ¢, \$, y)\) is prefix splittable (C is a psCRLS)
if \(¢, \$, y \in \mathrm{Irr}(R) \cap (\Gamma \setminus \Sigma)\) (let the inner alphabet be
\(\Gamma_{\mathrm{inner}} := \Gamma \setminus \{¢, \$, y\}\)) and for any rule \(r \in R\) there exists a splitting
\((u, v, w, x)\) with:

1. \(r = (uvw, uxw)\)

2. \(v\) is non-empty.

3. \(uvw\) may contain at most one ¢ and if so at its beginning. Also it can have
at most one $ which only may appear at the end. All other letters of \(uvw\)
have to be from the inner alphabet \(\Gamma_{\mathrm{inner}}\).

4. \(x\) is a single letter not equal to ¢ or $ or it is the empty word.

5. If \(v\) contains a ¢ or $, then \(x = y\), \(u\) and \(w\) are empty, and \(v\) is of the form
\(¢ \cdot \Gamma_{\mathrm{inner}}^* \cdot \$\).

6. If \(x = y\), then \(u\) and \(w\) are empty, and \(v\) is of the form \(¢ \cdot \Gamma_{\mathrm{inner}}^* \cdot \$\).

^ Or otherwise terminating, but this will not be considered in this text. Also, in case
of terminating rewriting systems, local confluence implies confluence.

The splitting (u, v, w, x) of a rule r allowed by this is called a potential prefix
splitting; u and w are called the left and right context, respectively.

Although this definition seems to be rather restrictive, it is possible to show that
it is a normal form for CRLS’s. We only state the result because its proof is
beyond the scope of this text:

Theorem 1. Let \(C = (\Gamma, \Sigma, R, k_l, k_r, y)\) be a CRLS (without restriction of
generality let R be length reducing [NO98]) with language \(L_C\). Then there exists a
psCRLS \(C'\) with \(L_{C'} = L_C\).

Note on proof. The main idea is to use a compression technique, that is, an
alphabet of non-terminals each of which can store more than one letter (of the
input or of intermediate reduction results). The biggest problem is to ensure that
the new system is weight reducing. This can be done by spreading weights over more
than one of these compression letters. In order to achieve this it is necessary
to simulate single rules by chains of rules that are linked to each other in a
way that preserves confluence. This also requires a system where at any time the
place of the next possible reduction can be uniquely identified. When working on a
compression alphabet, restricting the end marker words to special letters
is straightforward. For more details see [Woi00a].

Remark 1. A rule may have several different prefix splittings. For our investigation
it will not matter which prefix splitting we choose. We therefore choose a prefix
splitting arbitrarily (see also [Woi00b]).

The idea of constructing a prefix CRLS for a psCRLS is very basic: simply cut
off suffixes of rules. To be precise, some effort is necessary to handle the right
end of words. Given any unique definition of prefix splittings, a prefix system is
defined as follows:

Definition 3. Let C be a psCRLS, \(r \in \mathrm{Rew}(C)\) with prefix splitting \((u, v, w, x)\),
\(v = a_1 \cdots a_i \cdots a_{|v|}\), \(a_i \in \Gamma\), \(w = b_1 \cdots b_i \cdots b_{|w|}\), \(b_i \in \Gamma\).

The prefix rules of r (PREF(r)) are defined as:

\[
\mathrm{PREF}(r) := \Bigl(
\{(u v b_1 \cdots b_j \$,\ u x b_1 \cdots b_j \$) \mid 0 \le j < |w|\}
\;\cup\;
\begin{cases}
\{(a_1 \cdots a_j \$,\ y) \mid 0 < j < |v|\} & x = y \\
\{(u a_1 \cdots a_j \$,\ u x \$) \mid 0 < j < |v|\} & \text{else}
\end{cases}
\Bigr) \setminus \{(w, w) \mid w \in \Gamma^*\}
\]

Define the following rewriting system \(R'\):

\[
R' = R \cup \bigcup_{r \in R} \mathrm{PREF}(r).
\]

If \(R'\) is weight reducing and confluent (which is decidable) then the CRLS
\(C' = (\Gamma, \Sigma, R', ¢, \$, y)\) is called prefix system of C, \(C' = \mathrm{Pref}(C)\).

We also use \(R' = \mathrm{PREF}(R)\) in that case and call R the origin of \(R'\). If \(R'\) is not
confluent or not weight reducing, PREF(R) and Pref(C) are not defined. The
reason for this will be discussed later. The process of building PREF(R) is called
prefix construction.
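The prefix construction can be sketched directly from Definition 3. The Python code below is a minimal illustration, not the author's implementation: the names `pref` and `prefix_construction` are hypothetical, a rule is represented by its splitting (u, v, w, x), and the six splittings are those of Example 1 in the next section. The sketch only builds the rule set; it does not test whether the result is weight reducing and confluent.

```python
LEFT, RIGHT, ACCEPT = "\u00a2", "$", "y"  # cent sign, dollar sign, accepting letter


def pref(u, v, w, x):
    """Prefix rules PREF(r) of the rule r = (uvw, uxw) with splitting (u, v, w, x)."""
    rules = set()
    # Keep only the first j letters of the right context w (0 <= j < |w|).
    for j in range(len(w)):
        rules.add((u + v + w[:j] + RIGHT, u + x + w[:j] + RIGHT))
    # Keep only the first j letters of the rewritten part v (0 < j < |v|).
    for j in range(1, len(v)):
        if x == ACCEPT:
            rules.add((v[:j] + RIGHT, ACCEPT))
        else:
            rules.add((u + v[:j] + RIGHT, u + x + RIGHT))
    # Remove identity rules (w, w).
    return {(lhs, rhs) for lhs, rhs in rules if lhs != rhs}


def prefix_construction(splittings):
    """R' = R united with PREF(r) for every rule r of R."""
    rules = {(u + v + w, u + x + w) for u, v, w, x in splittings}
    for splitting in splittings:
        rules |= pref(*splitting)
    return rules


EXAMPLE_1 = [
    ("", "abc", "", "b"),
    ("", "abb", "bc", "a"),
    ("", "bb", "$", "b"),
    ("", "db", "", "b"),
    ("", "e", "c", "b"),
    ("", LEFT + "b$", "", ACCEPT),
]

R_PRIME = prefix_construction(EXAMPLE_1)
for lhs, rhs in sorted(R_PRIME):
    print(lhs, "->", rhs)
```

For these six splittings the sketch produces the fourteen rules listed in Fig. 1.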



3 Some Examples 

The first example will be used throughout the rest of this paper to show the 
effects of the prefix construction: 

Example 1. The psCRLS C is defined in the following way: Let \(\Sigma = \{a, b, c, d, e\}\),
\(\Gamma = \Sigma \cup \{\$, ¢, y\}\) with left and right end markers ¢ and $ and accepting symbol
y. Let the rewriting system R be defined as follows. We mark a prefix splitting^
with concatenation dots; these dots do not belong to the rules:



    ·abc·    →  ·b·
    ·abb·bc  →  ·a·bc
    ·bb·$    →  ·b·$
    ·db·     →  ·b·
    ·e·c     →  ·b·c
    ·¢b$·    →  ·y



In figure 1 the prefix system R' of C' = Pref(C) is given. The parts appearing 
in brackets and the double rules may be ignored at this point. They contain 
information about the parts cut off which will be used later. 

Since it is easily verified that \(R'\) is confluent and weight reducing we omit this
here. Now have a look at a prefix of the word \(adabbdecc \in L_C\) and its acceptance
in \(C'\). Observe the mixture of rules already in R and those new rules from \(R' \setminus R\):



\[
¢adabbde\$ \to_{r_{12}} ¢adabbdb\$ \to_{r_9} ¢adabbb\$ \to_{r_5} ¢adab\$ \to_{r_2} ¢adb\$ \to_{r_9} ¢ab\$ \to_{r_2} ¢b\$ \to_{r_{13}} y
\]
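The reduction above can be replayed mechanically. The following Python sketch checks that each step is a legal application of the named rule. The rule numbers and rules are transcribed from Fig. 1; the helper names `rewrite_once` and `chain_is_valid` are hypothetical.

```python
LEFT = "\u00a2"  # left end marker (cent sign)

# The rules of Fig. 1 that occur in the reduction, keyed by rule number.
RULES = {
    2: ("ab$", "b$"),
    5: ("abbb$", "ab$"),
    9: ("db", "b"),
    12: ("e$", "b$"),
    13: (LEFT + "b$", "y"),
}


def rewrite_once(word, lhs, rhs):
    """All words obtained by replacing one occurrence of lhs in word by rhs."""
    results = set()
    pos = word.find(lhs)
    while pos != -1:
        results.add(word[:pos] + rhs + word[pos + len(lhs):])
        pos = word.find(lhs, pos + 1)
    return results


def chain_is_valid(start, steps):
    """Check that every (rule number, next word) pair is a legal rewriting step."""
    word = start
    for number, nxt in steps:
        if nxt not in rewrite_once(word, *RULES[number]):
            return False
        word = nxt
    return True


CHAIN = [
    (12, LEFT + "adabbdb$"),
    (9, LEFT + "adabbb$"),
    (5, LEFT + "adab$"),
    (2, LEFT + "adb$"),
    (9, LEFT + "ab$"),
    (2, LEFT + "b$"),
    (13, "y"),
]

print(chain_is_valid(LEFT + "adabbde$", CHAIN))
```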

This seems to work fine, but we cannot always be sure that a prefix system is
doing what we would expect it to do. Consider these three examples:

^ Note that this choice is not necessarily optimal, but here, this will not be discussed
further.

    r1    abc     →  b     [□/□]
    r2    ab$     →  b$    [c/□]
    r3    a$      →  b$    [bc/□]
    r4    abbbc   →  abc   [□/□]
    r5    abbb$   →  ab$   [□/c]
    r6    abb$    →  a$    [□/bc]
    r7    ab$     →  a$    [b/bc]
    r8    bb$     →  b$    [□/$]
    r9    db      →  b     [□/□]
    r10   d$      →  b$    [b/□]
    r11   ec      →  bc    [□/□]
    r12   e$      →  b$    [□/c]
    r13   ¢b$     →  y     [□/$]
    r14   ¢$      →  y     [b/$]

Fig. 1. The prefix system R' for example 1.



Let the psCRLSs be \(C_1, C_2, C_3\). Let \(\Sigma = \{a, b, c\}\) and \(\Gamma = \{a, b, c, ¢, \$, y\}\)
be the common alphabets of the three systems. Build prefix rewriting systems
\(R'_1, R'_2, R'_3\) with the prefix construction.



Example 2. \(R_1 = \{(a, b), (ba, a), (bb, a), (¢a\$, y)\}\)
\(R'_1 = R_1 \cup \{(b\$, a\$), (¢\$, y)\}\)
\(R'_1\) is not weight reducing: \(a\$ \to b\$ \to a\$\)

Example 3. \(R_2 = \{(abb\$, a\$), (bbc, d), (¢a\$, y)\}\)
\(R'_2 = R_2 \cup \{(ab\$, a\$), (bb\$, d\$), (b\$, d\$), (¢\$, y)\}\)
\(R'_2\) is not confluent:
\(¢abb\$ \to_{R'_2}^* y \in \mathrm{Irr}(R'_2)\) and \(¢abb\$ \to_{R'_2}^* ¢ad\$ \in \mathrm{Irr}(R'_2)\)



Example 4. \(R_3 = \{(¢B\$, y)\}\)
\(R'_3 = R_3 \cup \{(¢\$, y)\}\)
\(L_{C_3} = \emptyset\) but \(L_{C'_3} = \{\Box\}\).
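The failures in Examples 2 and 3 can be observed by exploring all reductions from a start word. The following Python sketch does this by breadth-first search. The helper names are hypothetical, and the search is only guaranteed to terminate here because none of the rules involved lengthens a word.

```python
from collections import deque

LEFT = "\u00a2"  # left end marker (cent sign)


def successors(word, rules):
    """All words reachable from `word` in exactly one rewriting step."""
    result = set()
    for lhs, rhs in rules:
        pos = word.find(lhs)
        while pos != -1:
            result.add(word[:pos] + rhs + word[pos + len(lhs):])
            pos = word.find(lhs, pos + 1)
    return result


def reachable(word, rules):
    """All words reachable from `word` in zero or more steps."""
    seen, queue = {word}, deque([word])
    while queue:
        for nxt in successors(queue.popleft(), rules):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def irreducible_descendants(word, rules):
    return {w for w in reachable(word, rules) if not successors(w, rules)}


def lies_on_cycle(word, rules):
    """True if `word` can be rewritten to itself in at least one step."""
    return any(word in reachable(nxt, rules) for nxt in successors(word, rules))


R1_PRIME = [("a", "b"), ("ba", "a"), ("bb", "a"), (LEFT + "a$", "y"),
            ("b$", "a$"), (LEFT + "$", "y")]
R2_PRIME = [("abb$", "a$"), ("bbc", "d"), (LEFT + "a$", "y"),
            ("ab$", "a$"), ("bb$", "d$"), ("b$", "d$"), (LEFT + "$", "y")]

print(lies_on_cycle("a$", R1_PRIME))                      # Example 2
print(irreducible_descendants(LEFT + "abb$", R2_PRIME))   # Example 3
```

A cycle rules out any weight function for the first system, and more than one irreducible descendant of the same word rules out confluence for the second.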

These examples lead to the following definition and a result stated as remark.
They also answer the question why we required \(R'\) to be confluent and weight
reducing in the definition of Pref(C).

Definition 4. Let C be a psCRLS. Then Pref(C) is correct if and only if it is
defined and \(L_{\mathrm{Pref}(C)} = \mathrm{Pref}(L_C)\).



Remark 2. There are psCRLS C so that Pref(C) is not defined or not correct.

On the other hand we have a positive result:

Theorem 2. Let C be a psCRLS and \(C' = \mathrm{Pref}(C)\) the prefix system with \(R'\)
as string-rewriting system of \(C'\). Then \(\mathrm{Pref}(L_C) \subseteq L_{C'}\).

The proof is a simple induction over the length of reductions in C. 
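The statement of Theorem 2 can be checked on a single word of Example 1. The Python sketch below tests acceptance by exhaustive search over all reductions and applies it to every prefix of adabbdecc. The rule list is transcribed from Fig. 1 and the name `accepts` is hypothetical. The search terminates because no rule of Fig. 1 lengthens a word.

```python
from collections import deque

LEFT = "\u00a2"  # left end marker (cent sign)

# The fourteen rules of Fig. 1 as (left-hand side, right-hand side) pairs.
R_PRIME = [
    ("abc", "b"), ("ab$", "b$"), ("a$", "b$"),
    ("abbbc", "abc"), ("abbb$", "ab$"), ("abb$", "a$"), ("ab$", "a$"),
    ("bb$", "b$"), ("db", "b"), ("d$", "b$"),
    ("ec", "bc"), ("e$", "b$"),
    (LEFT + "b$", "y"), (LEFT + "$", "y"),
]


def accepts(word, rules):
    """True if the marked word can be rewritten to the accepting letter y."""
    start = LEFT + word + "$"
    seen, queue = {start}, deque([start])
    while queue:
        current = queue.popleft()
        if current == "y":
            return True
        for lhs, rhs in rules:
            pos = current.find(lhs)
            while pos != -1:
                nxt = current[:pos] + rhs + current[pos + len(lhs):]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                pos = current.find(lhs, pos + 1)
    return False


WORD = "adabbdecc"
PREFIXES = [WORD[:k] for k in range(len(WORD) + 1)]
print(all(accepts(p, R_PRIME) for p in PREFIXES))
```

The same function also accepts abbbb, a word that the later discussion identifies as no prefix of any word in the original language, so the inclusion of Theorem 2 can be proper.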



4 What Is Happening in Prefix Reductions? 

A closer look at the rules of a psCRLS reveals the different roles of the parts of
the prefix splittings. Let r be a rule with prefix splitting (u, v, w, x). Then in the
prefix construction the left context u is never deleted. In contrast to this, parts v
and w are deleted. Still those two differ in their meaning, because by applying r
the information of v is always lost (resp. substituted by x), whereas w remains
unchanged. Obviously, this is important for accepting correct prefixes. In order
to use this, we now define expanded prefix CRLSs which have 4-tuples instead
of pairs as rules. In these, the first two components have the same meaning as
in usual rewriting systems, including the relation →. In the other two we store
what has been cut off during the prefix construction:

Definition 5. Let R be a rewriting system defined on the alphabet \(\Gamma\), and \(a \in \Gamma\).
Then \(R \setminus a\) is the subsystem of R that is obtained by removing all rules containing
the letter a.



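Definition 6. Let C be a psCRLS with rewriting system R and \(r \in R\) with
prefix splitting \((u, v, w, x)\), \(v = a_1 \cdot a_2 \cdots a_i \cdots a_{|v|}\), \(a_i \in \Gamma\),
\(w = b_1 \cdot b_2 \cdots b_i \cdots b_{|w|}\), \(b_i \in \Gamma\). If \(R \setminus \$\) is confluent and Pref(C) is defined,
we define the set CPREF(r) of completion prefix rules of r (or, shorter, completion rules):

\[
\mathrm{CPREF}(r) := \Bigl(
\{(u v b_1 \cdots b_j \$,\ [u x b_1 \cdots b_j \$]_{R \setminus \$},\ \Box,\ b_{j+1} \cdots b_{|w|}) \mid 0 \le j < |w|\}
\;\cup\;
\begin{cases}
\{(a_1 \cdots a_j \$,\ y,\ a_{j+1} \cdots a_{|v|-1},\ \$) \mid 0 < j < |v|\} & x = y \\
\{(u a_1 \cdots a_j \$,\ [u x \$]_{R \setminus \$},\ a_{j+1} \cdots a_{|v|},\ w) \mid 0 < j < |v|\} & \text{else}
\end{cases}
\;\cup\;
\{(uvw,\ uxw,\ \Box,\ \Box) \mid uvw \notin \Gamma^* \cdot \$\}
\Bigr) \setminus \{(w', w', u', v') \mid w', u', v' \in \Gamma^*\}
\]

The set \(\{(uvw, uxw, \Box, \Box) \mid uvw \notin \Gamma^* \cdot \$\}\) is empty or a singleton.

— If \(R \setminus \$\) is not confluent, CPREF(r) is not defined. It should be possible to
avoid this restriction but this is not in the scope of this investigation.

— Reducing the second component with all rules of R that do not work at the
right end of words (which \(R \setminus \$\) means) does not change the defined language.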




Prefix Languages of Church-Rosser Languages 523 



— The third component of the completion rules is called consumed completion.

— The fourth component is called unconsumed completion.

— The system \(R'\) is called completion prefix system of R with
\(R' = \bigcup_{r \in R} \mathrm{CPREF}(r)\). With \(\mathrm{Rew}(R')\) we denote the extraction of the first two
components, which has a rewriting system as result. Note: If Pref(C) is
defined then \(\mathrm{Rew}(R') = \mathrm{Rew}(\mathrm{Pref}(C))\).

— In the same manner, CPref(C) will be used for the completion prefix CRLS
given by the alphabets and accepting letter of C and \(R'\) iff Pref(C) is
defined.

— \(R'_{\mathrm{pref}} := R' \setminus \{(u, v, \Box, \Box) \mid (u, v) \in \Gamma^* \times \Gamma^*\}\) is the set of rules from \(R'\) with
nonempty completions.

— The set of all (old) rules that do not work at the right end can be identified
with \(R' \setminus \$ := R' \setminus R'_{\mathrm{pref}}\).

— \(C'_{\mathrm{pref}}\) is the completion prefix system which is given by the expanded rewriting
system \(R'_{\mathrm{pref}}\) and the remaining parts being identical to those of \(C'\).

— \(C' \setminus \$\) is defined accordingly.
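The completion rules can be generated in the same way as the prefix rules, with two extra components that record what was cut off. The Python sketch below follows Definition 6 under stated assumptions: `cpref`, `normal_form` and `completion_prefix_system` are hypothetical names, the empty word is the empty string, and the inner rules (those without $) are assumed to be terminating and confluent so that a greedy normal form is well defined.

```python
LEFT, RIGHT, ACCEPT = "\u00a2", "$", "y"  # cent sign, dollar sign, accepting letter


def normal_form(word, rules):
    """Rewrite greedily until no rule applies (assumes termination and confluence)."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in word:
                word = word.replace(lhs, rhs, 1)
                changed = True
                break
    return word


def cpref(u, v, w, x, inner_rules):
    """Completion rules (first, second, consumed, unconsumed) for one splitting."""
    rules = set()
    for j in range(len(w)):            # cut inside the right context w
        second = normal_form(u + x + w[:j] + RIGHT, inner_rules)
        rules.add((u + v + w[:j] + RIGHT, second, "", w[j:]))
    for j in range(1, len(v)):         # cut inside the rewritten part v
        if x == ACCEPT:
            rules.add((v[:j] + RIGHT, ACCEPT, v[j:len(v) - 1], RIGHT))
        else:
            second = normal_form(u + x + RIGHT, inner_rules)
            rules.add((u + v[:j] + RIGHT, second, v[j:], w))
    if not (u + v + w).endswith(RIGHT):  # old rule that does not touch the right end
        rules.add((u + v + w, u + x + w, "", ""))
    return {rule for rule in rules if rule[0] != rule[1]}


def completion_prefix_system(splittings):
    base = [(u + v + w, u + x + w) for u, v, w, x in splittings]
    inner = [(lhs, rhs) for lhs, rhs in base if RIGHT not in lhs + rhs]
    result = set()
    for splitting in splittings:
        result |= cpref(*splitting, inner)
    return result


EXAMPLE_1 = [
    ("", "abc", "", "b"), ("", "abb", "bc", "a"), ("", "bb", "$", "b"),
    ("", "db", "", "b"), ("", "e", "c", "b"), ("", LEFT + "b$", "", ACCEPT),
]
for rule in sorted(completion_prefix_system(EXAMPLE_1)):
    print(rule)
```

For the splittings of Example 1 the result agrees with the rules and bracket annotations of Fig. 1.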



Remark 3. By using \(R'_{\mathrm{pref}}\) we can speak of all newly generated rules. On the
other hand \(R' \setminus \$\) are all old rules that do not work at the right end of words.
Those old rules that do work at the right end of words will have a representative
in \(R'_{\mathrm{pref}}\).

Now we can explain the brackets in figure 1. The words left of the slashes are
consumed completions. They will be of no importance for further reductions.
Right of the slashes are the unconsumed completions. They play an important
role for the correctness of prefix systems: they link applications of prefix rules.
To explain this, we have a closer look at a part of the above reduction. Below
each → we write the bracket expression of the respective rule:

\[
¢adabbde\$ \to_{[\Box/c]} ¢adabbdb\$ \to_{[\Box/\Box]} ¢adabbb\$ \to_{[\Box/c]} ¢adab\$ \to_{[c/\Box]} ¢adb\$
\]

The application of rule \(r_{12}\) means: “guess that the next letter would be a c and
that it belonged to the unchanged right context.” Then rule \(r_9\) is used. Since
this is an old rule, it does not change the guess of a completion. After that, rule
\(r_5\) is used. It again assumes an unchanged c to be the next letter of a possible
completion to a correct word. The application of rule \(r_5\) fits to that of \(r_{12}\) which
assumed the same. Now rule \(r_2\) is used. Here the c is still the same letter, but is
part of the consumed completion. This means in other words: “we guessed the
next letter to be a c, this guess was correct, and now it is completely used, so
we do not need to consider it further.”

In contrast to this, consider using rule \(r_8\) twice instead of \(r_5\). This rule guesses
the end of the word. So its unconsumed completion will not fit to the completion
of \(r_2\). This is of major importance. Because of \(r_8\), \(L_{C'}\) also contains abbbb
which clearly is no prefix of a word in \(L_C\).

This has two consequences: (a) we have to find a method to check such cases
and (b) an extension of remark 2:

Remark 4. There are psCRLS C without true nonterminals, i.e. \(\Gamma = \Sigma \cup \{¢, \$, y\}\),
so that Pref(C) is defined but not correct, which means \(L_{\mathrm{Pref}(C)} \neq \mathrm{Pref}(L_C)\).
So, a natural question is whether it is possible to determine if the system
Pref(C) is correct. This will be discussed in the next section.



5 Main Result 

Our first step is to introduce a way to store information about the interaction 
between old rules and new rules and the involved completions. 

Definition 7. Let C be a psCRLS and \(C' = \mathrm{CPref}(C)\) an expanded CRLS
with expanded rewriting system \(R'\). A set K of 4-tuples is called candidate set
of \(C'\), if for all \(u = (u_1, u_2, u_3, u_4) \in K\) the following holds:

1. \(u_1\) ends with $: \(u_1 \in \Gamma^* \cdot \$\)

2. \(u_2\) is the accepting letter or ends with $: \(u_2 \in (\{y\} \cup (\Gamma^* \cdot \$))\)

3. \(u_2\) is irreducible w.r.t. all old rules that do not work at the right end:
\(u_2 \in \mathrm{Irr}(R' \setminus \$)\)

4. \(u_3\) is built from the inner alphabet: \(u_3 \in (\Gamma \setminus \{¢, \$, y\})^*\)

5. \(u_4\) is mainly from the inner alphabet, yet it may have a $ at its end:
\(u_4 \in (\Gamma \setminus \{¢, \$, y\})^* \cdot \{\Box, \$\}\)

6. a) \(u_1\) can be reduced to \(u_2\) using only old rules that do not work at the word
end: \(u_1 \to_{R' \setminus \$}^* u_2\)
or
b) u itself is a new rule: \(u \in R'_{\mathrm{pref}}\).

The elements of candidate sets are called candidates. In the above case 6a, u is a
representative of a reduction with \(R' \setminus \$\). We also call u a reduction candidate.
In case 6b we call u a rule candidate from \(R'_{\mathrm{pref}}\).
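For rule candidates, the conditions of Definition 7 are purely syntactic and can be tested directly. The Python sketch below covers conditions 1 to 5 together with case 6b only; reduction candidates (case 6a) are not handled. The function name `is_rule_candidate` and the sample data, transcribed from Fig. 1, are assumptions made for illustration.

```python
LEFT, RIGHT, ACCEPT = "\u00a2", "$", "y"  # cent sign, dollar sign, accepting letter


def is_irreducible(word, rules):
    """True if no left-hand side of `rules` occurs in `word`."""
    return not any(lhs in word for lhs, _ in rules)


def is_rule_candidate(u, old_rules, new_rules, gamma):
    """Conditions 1-5 and 6b of Definition 7 for a 4-tuple u."""
    u1, u2, u3, u4 = u
    inner = set(gamma) - {LEFT, RIGHT, ACCEPT}
    body_of_u4 = u4[:-1] if u4.endswith(RIGHT) else u4
    return (
        u1.endswith(RIGHT)                          # 1: u1 ends with $
        and (u2 == ACCEPT or u2.endswith(RIGHT))    # 2: u2 is y or ends with $
        and is_irreducible(u2, old_rules)           # 3: u2 irreducible w.r.t. old rules
        and set(u3) <= inner                        # 4: u3 over the inner alphabet
        and set(body_of_u4) <= inner                # 5: u4 inner, optional final $
        and u in new_rules                          # 6b: u is a new rule
    )


GAMMA = "abcde" + LEFT + RIGHT + ACCEPT
OLD = [("abc", "b"), ("abbbc", "abc"), ("db", "b"), ("ec", "bc")]
NEW = {("abbb$", "ab$", "", "c"), ("ab$", "b$", "c", ""), (LEFT + "$", "y", "b", "$")}

print(is_rule_candidate(("abbb$", "ab$", "", "c"), OLD, NEW, GAMMA))
print(is_rule_candidate(("abc", "b", "", ""), OLD, NEW, GAMMA))
```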

Now we want to know if two chains of reductions on partial words, each
represented by a candidate, can follow one another:

Definition 8. Let C be a psCRLS, \(C' = \mathrm{CPref}(C)\), K a candidate set of \(C'\)
and \(u = (u_1, u_2, u_3, u_4)\), \(v = (v_1, v_2, v_3, v_4) \in K\).

We say u allows v in K with rest w, \(u \vdash_{K,w} v\), if one of the following conditions
holds:

(i) \(u_2\) and \(v_1\) overlap so that their right ends are matched together, then w is
empty:
\(w = \Box\) and \(v \in R'_{\mathrm{pref}}\) and \((u_2 \in \mathrm{Suff}(v_1) \lor v_1 \in \mathrm{Suff}(u_2))\)

(ii) \(u_2\) and \(v_1\) do not overlap in that way but one of the old rules not working
at the right end of words can be padded at the right end of its first component
with a word w so that an indirect overlap (via reduction) is possible.
Furthermore the first component of this old rule has an overlap with \(u_2\)
that reaches into the part of the word which is changed by the reduction
represented by the candidate u:
\(w \neq \Box\) and \(\exists (v'_1, v'_2) \in \mathrm{Rew}(C' \setminus \$)\), \(v' \in \Gamma^*\) so that
\(|v'_1| - |v'| > \mathrm{lcp}(u_1, u_2)\) and \(v_1 = v'_1 w = v' u_2\) and \(v_2 = [v'_2 w]_{\mathrm{Rew}(C' \setminus \$)}\).

With \(\mathrm{lcp}(u_1, u_2)\) we denote the length of the longest common prefix of \(u_1\) and \(u_2\).

In figure 2 the second alternative of the definition, which is more complicated, 
is illustrated. Irreducible parts are set in boldface. 



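[Figure 2: diagram relating \(u_1\), \(u_2\) and their common prefix \(u'\) with \(|u'| = \mathrm{lcp}(u_1, u_2)\) to the words \(v'\), \(v_1\), \(v_2\) and the rest \(w\).]

Fig. 2. The allows relation \(\vdash_{K,w}\) with \(w \neq \Box\).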



The last step is to check if the completions appearing in these reductions fit 
together. 

Definition 9. Let C be a psCRLS, let \(C' = \mathrm{CPref}(C)\), K a candidate set of
\(C'\), and \(u = (u_1, u_2, u_3, u_4)\), \(v = (v_1, v_2, v_3, v_4) \in K\).

We say u allows v in K with rest w and correct completion if
\(u_4 \in \mathrm{Pref}(v_3 v_4)\) and \(u \vdash_{K,w} v\).

That means the reduction represented by v may safely be applied after the one
represented by u since the rest of the completion left by u fits to the completion
of v.

Notation: \(u \rhd_{K,w} v\). Also, if we are not interested in w: \(u \rhd_K v\); then \(\rhd_K^+\) and
\(\rhd_K^*\) denote the transitive (reflexive and transitive) closure.

If \(u \rhd_K^+ v\) and u, v are rule candidates from \(R'_{\mathrm{pref}}\), and if no \(u'\) exists which is
a rule candidate from \(R'_{\mathrm{pref}}\) so that \(u \rhd_K^+ u' \rhd_K^+ v\), we say u directly allows v
in K with correct completion.

The definition of \(u \rhd_K^+ v\) does not distinguish how many prefix rules are used
between u and v. By requiring that u directly allows v we can make sure that v
can be reached from u by using exactly one prefix rule.

Example 5. (for definition 9) There exists no candidate set K so that \(r_8 \rhd_K r_2\).
There exists a K so that \(r_5 \rhd_K r_2\).
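The interplay of overlap and completion can be tried out on three rules of Fig. 1. The Python sketch below implements only the suffix overlap of condition (i) in Definition 8 and the completion test of Definition 9. It ignores condition (ii) and the membership requirements on candidate sets, and all function names are hypothetical.

```python
def overlaps_at_right_end(u, v):
    """Suffix overlap of condition (i): u2 is a suffix of v1 or v1 is a suffix of u2."""
    u2, v1 = u[1], v[0]
    return v1.endswith(u2) or u2.endswith(v1)


def completion_fits(u, v):
    """Definition 9: the unconsumed completion u4 is a prefix of v3 v4."""
    return (v[2] + v[3]).startswith(u[3])


def allows_with_correct_completion(u, v):
    return overlaps_at_right_end(u, v) and completion_fits(u, v)


# 4-tuples (first, second, consumed, unconsumed) transcribed from Fig. 1.
R2 = ("ab$", "b$", "c", "")
R5 = ("abbb$", "ab$", "", "c")
R8 = ("bb$", "b$", "", "$")

print(allows_with_correct_completion(R5, R2))
print(allows_with_correct_completion(R8, R2))
```

Both pairs overlap at the right end, but only the first one has a fitting completion: the second rule in the list expects a c to follow, while the third one has already guessed the end of the word.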

Now we can define sets of correctly working right end reductions. This defini- 
tion is inductive. In order to understand this definition it helps to think of the 
reductions backwards, i.e. from the accepting y to the start of the reductions. 

Definition 10. Let C be a psCRLS, \(C' = \mathrm{CPref}(C)\), W a candidate set of \(C'\).

W is a working set of \(C'\) if for all \(u = (u_1, u_2, u_3, u_4) \in W\) one of the following
conditions holds:

(i) \(u_4 = \Box\) (Fully consumed completion means that the next reduction(s) may
be applied without regard of the completion; of course only until a new
unconsumed completion appears.)

(ii) \(u_4 = \$ \land u_2 = y\) (After an accepting rule where only the word end marker
is left as completion no further harm can be done. The reduction is finished
...)

(iii) for all \(v = (v_1, v_2, v_3, v_4) \in W\), \(w \in \Gamma^*\) with \(u \vdash_{W,w} v\)
there exists \(v' = (v_1, v_2, v'_3, v'_4) \in W\) with \(u \rhd_{W,w} v'\)
(Whenever a reduction with v after u can take place, there is a variant \(v'\)
of v with fitting next completion that does exactly the same w.r.t. the
rewriting relation.)

The set of all working sets of \(C'\) is denoted Working(\(C'\)). Since the definition
is inductive, an algorithm can be given to compute or at least enumerate working
sets. For details, see [Woi00b].

The following lemma shows that working sets can be used to check the correct- 
ness of prefix systems: 

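Lemma 1. Let C be a psCRLS, with \(\Gamma = \Sigma \cup \{¢, \$, y\}\), and
\(C' = \{(u_1, u_2, u_3, u_4) \mid (u_1, u_2, u_3, u_4) \in \mathrm{CPref}(C),\ u_2 = [u_2]_{C \setminus \$}\}\).
Let \(W \in \mathrm{Working}(C')\) and \(C''\) be a subset of \(C'\) so that for all rules
\(r \in \mathrm{Rew}(C'')\) either \(r \in \mathrm{Rew}(C' \setminus \$)\) or \(r \in \mathrm{Rew}(C'') \cap \mathrm{Rew}(W)\).
Let \(w \in \Sigma^*\) be accepted by \(C''\): \(¢w\$ \to_{C''}^* y\). Then
there exists a \(\hat{w} \in \Sigma^*\) so that \(¢w\hat{w}\$ \to_C^* y\).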

Proof. Since for any reduction in a confluent weight reducing system there exists
a right canonical reduction and because of the properties of working sets there
is an \(n \ge 1\) and a reduction of the form:

\[
¢w\$ = w_1 \to_{C \setminus \$}^* \cdot \to_{r_1} w_2 \to_{C \setminus \$}^* \cdot \to_{r_2} \cdots \to_{r_{n-1}} w_n \to_{C \setminus \$}^* \cdot \to_{r_n} y,
\qquad r_i \in \mathrm{Rew}(C''_{\mathrm{pref}})
\]

with \(r_i \rhd_W r_{i+1}\) or \(r_i = (u, v, s, \Box)\) for all \(1 \le i < n\).

We show the lemma with an induction over n.

Basis, n = 1: Then rule \(r_n\) is accepting with consumed completion s (possibly
empty) and unconsumed completion $. So \(\hat{w} = s\) is the correct completion.

Claim. Let the lemma hold for \(n \ge 1\).

Induction step (\(n \to n + 1\)): We can find \(w'\), \(w''\), and \(w'''\) so that
\(w_1 = ¢w'\$ \to_{C \setminus \$}^* ¢w''\$ \to_{r_1} ¢w'''\$ = w_2\).
There are three cases, we will only show the proof for the first one, the other
two cases are similar:

Case 1: \(r_1 = (u\$, v\$, s, \Box)\). With the induction claim there always exists a
completion t with \(¢w'''t\$ \to_C^* y\). There exists \(r'_1 \in \mathrm{Rew}(C)\) with \(r'_1 = (us, v)\).
This leads to \(¢w' \to_C^* ¢w''\) and in consequence \(¢w's \to_C^* ¢w''s \to_C ¢w'''\).
So, \(¢w'st \to_C^* ¢w''st \to_C ¢w'''t\). Now, \(\hat{w} = st\) is the correct completion:
\(¢w'st\$ \to_C^* y\).

Case 2: \(r_1 \rhd_W r_2\) and \(r_1 = (u, v, s, s')\) with \(s, s' \in (\Gamma \setminus \{\$\})^*\) (analogous to case
1, because of \(r_1 \rhd_W r_2\) we know that \(s' \in \mathrm{Pref}(t)\) and therefore st is the
correct completion).

Case 3: \(r_1 \rhd_W r_2\) and \(r_1 = (u, v, s, s'\$)\) with \(s, s' \in (\Gamma \setminus \{\$\})^*\) (analogous to case
2, st is the correct completion). □



This directly gives the following result: 

Theorem 3. Let C be a psCRLS, with \(\Gamma = \Sigma \cup \{¢, \$, y\}\), let \(C' = \mathrm{CPref}(C)\)
(which is equivalent to Pref(C) w.r.t. the defined words), and
\(W \in \mathrm{Working}(C')\). If for all \(r = (u, v) \in \mathrm{Rew}(C'_{\mathrm{pref}})\) the condition
\((u, v) \in \mathrm{Rew}(W)\) holds, then CPref(C) is correct, and therefore also Pref(C) is correct.

Final remark. Obviously, these results can be used for a similar suffix construction.
The problem of true nonterminals (\(\Gamma_{\mathrm{inner}} \neq \Sigma\)) cannot be discussed here
for lack of space.

6 Conclusion

We have shown that under certain conditions it is possible to give an effective
construction for CRLS’s defining the prefix language of a CRL. Because prefix
splittable systems are a normal form, this syntactical restriction is not
a hindrance on its own account. But because the CRL’s are a basis for
the r.e. languages there still will be languages whose prefix systems cannot be
correct. To be precise, chances are high that the correctness of prefix systems
is undecidable. Furthermore, there could be CRL’s whose prefix languages are
CRL’s but for which the construction given here fails on any system defining
them. Another problem arises from true nonterminals, that is, nonterminal letters
in \(\Gamma\) that are neither end markers nor the accepting letter. These questions show a
line of further research in the theoretical aspects of Church-Rosser prefix
languages.

From the practical point of view, one might ask if more than “toy” languages
are possible. In [Rot00] this is answered in the positive. In this diploma thesis a
substantial subset (i.e., covering main syntactical problems) of Java syntax has
been described with a CRLS for which the prefix construction gives a correct
system. One possibility to incorporate this into usable software tools would be
to design something like “hybrid” LR(k)/CRL-compilers.

Altogether one can conclude that under theoretical as well as under practical
aspects prefix languages of CRL’s are worth future investigation.